Return of the XML Wikimedia Projects Dumps

From Wikimania

Presenter: Tomasz Finc
Themes: Technology
About the presenter
Tomasz Finc has been a full-time software developer in our San Francisco office since August 2008. He works on general MediaWiki development, code review, systems and administration tools, and optimization. He also helps support the San Francisco office.

XML dumps are very important to the community: they provide a ready dataset for researchers, developers, and administrators, who each use them in different ways. Sadly, the dumps as they currently run have outgrown their initial design and are broken. We can no longer say that they will run reliably, consistently, and predictably. I'd like to lead a presentation on the rework of our overall dump infrastructure and share insights on what worked, what didn't, and where we stand now.

Language: English