Back up forum's contents?

Posted: Mon Feb 04, 2008 3:15 am
by dch24
This is a question for Joe Strout or other admins:

Is there any way I can get a dump of the posts on the site?

I think the information here is valuable and should be available in some sort of archive. phpBB uses a database backend, so a few SQL queries should do it.

I have no intention of setting up a second site. I'm just talking about geographically distributed backups. And if the answer is no, thanks anyway. :)

Posted: Mon Feb 04, 2008 10:56 am
by Mikos
I also hope that this site is backed up. Backups (in the form of regularly created, compressed SQL dumps) distributed on servers around the world would be great. I can offer you space on my server for backups (the server is in the Czech Republic).

Posted: Mon Feb 04, 2008 6:26 pm
by dch24
To clarify, it would have to be a dump of only a few tables. I don't want people's passwords, email addresses, etc. Just posts.
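
For illustration, here is a minimal sketch of what a posts-only dump could look like. It assumes a MySQL backend and phpBB's default `phpbb_` table prefix. The database name `forum_db`, the account `backup_user`, and the exact table list are assumptions, not details confirmed anywhere in this thread; only the admins know the real ones. The script prints the `mysqldump` command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical names: adjust all three to the real installation.
DB_NAME="${DB_NAME:-forum_db}"      # assumed database name
DB_USER="${DB_USER:-backup_user}"   # assumed read-only account
PREFIX="${PREFIX:-phpbb_}"          # phpBB's default table prefix

# Tables holding forum structure and posts, but not the users table
# (which is where passwords and email addresses live).
# Assumption: phpBB 2 installs also need ${PREFIX}posts_text for post bodies.
posts_only_tables() {
    for t in forums topics posts; do
        printf '%s%s ' "$PREFIX" "$t"
    done
}

# Dry run: print the mysqldump command rather than executing it.
# "-p" with no value makes mysqldump prompt for the password.
build_dump_cmd() {
    printf 'mysqldump -u %s -p %s %s\n' \
        "$DB_USER" "$DB_NAME" "$(posts_only_tables)"
}

build_dump_cmd
```

One caveat: as far as I know the posts table also records each poster's IP address, so an admin may want to blank that column before sharing a dump.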

Posted: Tue Feb 05, 2008 12:02 am
by Mikos
dch24 wrote: To clarify, it would have to be a dump of only a few tables. I don't want people's passwords, email addresses, etc. Just posts.

Of course :-)

Posted: Mon May 12, 2008 9:17 am
by Keegan
dch24, I was thinking the exact same thing!

1) I don't keep a journal or a blog. My posts here are the only record of my thoughts and ideas. Thus I would really like a copy just in case.

2) As I said to Dr Nebel, my thoughts aren't "is this going to work" but "what's going to happen next". Thus I would really like a copy just in case.


So what's the best way to do this? I think exporting pure BB code might be a bit extreme, but I would proudly keep a copy. It would be nice if it could appear in a browser like we are seeing it now. I know there are a few programs that are designed to save websites. Any suggestions, guys?

Posted: Mon May 12, 2008 6:47 pm
by dch24
It would be fairly simple to use curl or wget and spider the website (creating a "working" copy that's not a forum, just static HTML). Instead of all of us DDOSing the forum with cron jobs, it might make more sense to ask the administrators what they would like.

Joe? MSimon?

Posted: Mon May 12, 2008 9:17 pm
by MSimon
You will have to ask Joe. I'm just the anti-spam guardian and occasional code clean up guy (to get pictures to display etc).

Posted: Tue May 13, 2008 3:10 am
by JoeStrout
I'm not opposed to somebody curling or wgetting the site now and then. I doubt it's going to amount to a serious load on the server.

Best,
- Joe

Posted: Tue May 13, 2008 3:51 pm
by dch24
Thanks, Joe :)

There are easy ways to control the "spidering" hit. For wget:
--quota to limit the amount downloaded
--wait to limit how fast requests hit the server
--limit-rate to limit network bandwidth (very rough, but it helps on large files)

So if anyone feels like the server is being abused, please just say something. I'm currently sending requests from 64.38.220.4.
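
As a rough sketch, the three options above can be combined into one throttled mirror run. The specific numbers (a 2-second wait, 100 KB/s, a 2000 MB quota) are placeholder assumptions chosen for illustration, not values anyone in this thread has agreed on. The script only prints the command so it can be checked before it is pointed at the server.

```shell
#!/bin/sh
# Placeholder values, for illustration only; tune them to what the admins accept.
WAIT_SECONDS="${WAIT_SECONDS:-2}"   # pause between successive requests
RATE_LIMIT="${RATE_LIMIT:-100k}"    # cap on download speed (bytes per second)
QUOTA="${QUOTA:-2000m}"             # total download budget for the whole run
START_URL="${START_URL:-http://www.talk-polywell.org/bb/index.php}"

# Dry run: print the wget command rather than executing it.
# -m turns on mirroring (recursive download with timestamping).
polite_mirror_cmd() {
    printf 'wget -m --wait=%s --limit-rate=%s --quota=%s %s\n' \
        "$WAIT_SECONDS" "$RATE_LIMIT" "$QUOTA" "$START_URL"
}

polite_mirror_cmd
```

Note that wget checks the quota between files, so a single large file can still run past it.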

Posted: Tue May 13, 2008 7:45 pm
by MSimon
Joe will be best informed.

However I'll post back here if I notice any slow down in posting.

The best time would be from midnight USA Central Time to about 8 AM GMT, from what I have seen of traffic patterns. That is about a 3-hour window.

Posted: Wed May 14, 2008 6:47 am
by dch24
I ran the following command:

wget -nv -EpkKm http://www.talk-polywell.org/bb/index.php

Then I compressed the output (1.1 GB) down to 63 MB and placed it at: http://polywell.nfshost.com/2008_05_13_ ... ks.tar.bz2

Since bandwidth and storage cost me, I'll leave it posted for a week. Keegan, Mikos, feel free to download a copy or ask for more time.

I tagged it "nooutsidelinks" because I think I can write a shell script to identify outside links (such as posts with pictures, PDFs, etc.) and add them to the next download. MSimon, I will be sure to run them after 5:00 AM GMT.
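
A minimal sketch of how such a link-finding script might look, not the script itself. It assumes the mirror sits in a directory named after the forum host (wget's default for a mirror run) and that links appear as double-quoted `href="..."` or `src="..."` attributes; single-quoted or unquoted attributes would be missed. The function name `list_outside_links` and the `FORUM_HOST` variable are made up for this example.

```shell
#!/bin/sh
# Assumption: links pointing at this host are "inside" links and get skipped.
FORUM_HOST="${FORUM_HOST:-www.talk-polywell.org}"

# Print every absolute http(s) URL found in href="..." or src="..."
# attributes under the given directory, drop the ones that point back
# at the forum itself, and list each remaining URL once.
list_outside_links() {
    mirror_dir="$1"
    grep -rhoE '(href|src)="https?://[^"]+"' "$mirror_dir" \
        | sed -E 's/^(href|src)="//; s/"$//' \
        | grep -vE "^https?://$FORUM_HOST" \
        | sort -u
}

# Only run when a directory is given on the command line.
if [ "$#" -gt 0 ]; then
    list_outside_links "$1"
fi
```

The resulting list could be saved to a file and handed to `wget -i`, ideally with `--wait` so the outside servers are not hammered either.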

Does anyone mind if the next download is larger than 63 MB?

Posted: Wed May 14, 2008 1:38 pm
by MSimon
If you open an account here:

http://www.mediafire.com/

You should be able to park files up to 100 MB for no charge.

Posted: Mon Aug 18, 2008 3:40 pm
by derg
Can we do something here or is this indefinitely on the backburner?

Posted: Tue Aug 19, 2008 8:13 pm
by dch24
Hi derg, did you get a chance to download the backup while I had it posted?

I will post another one in a few months and leave it posted for a week or so.

Posted: Tue Aug 19, 2008 8:25 pm
by MSimon
dch24 wrote: Hi derg, did you get a chance to download the backup while I had it posted?

I will post another one in a few months and leave it posted for a week or so.

I don't know what Joe thinks but every two weeks or so would be better.