We currently use ElementTree, which provides a good balance between usability and speed.
There is probably potential to speed up this library by switching to a faster XML parser. Candidates include lxml, cElementTree, and expat.
Migrating to lxml or cElementTree might be relatively easy because their APIs are similar to ElementTree's. Others have reported improved performance when using expat to parse Wikimedia dumps.
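
Because the APIs overlap, a parser swap could in principle be limited to the import line. The following is a minimal sketch of that idea, not this library's actual code: the function name `iter_page_titles` and the sample input are hypothetical, and the sample omits the XML namespace that real Wikimedia dumps declare (with a namespace, the tag names would need to be qualified). Note also that on Python 3.3 and later, `xml.etree.ElementTree` already uses its C accelerator when available, so a separate `cElementTree` import is only relevant on older interpreters.

```python
import io

try:
    # lxml is a third-party package; lxml.etree mirrors much of the
    # ElementTree API, so the code below does not need to change.
    from lxml import etree as ET
except ImportError:
    # Fall back to the standard-library implementation.
    import xml.etree.ElementTree as ET


def iter_page_titles(source):
    """Yield the <title> text of each <page>, streaming the input.

    `source` is a binary file-like object. Hypothetical helper for
    illustration only.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "page":
            yield elem.findtext("title")
            # Drop the element's children so memory stays bounded
            # on large dumps.
            elem.clear()


# Made-up sample input, far smaller and simpler than a real dump.
SAMPLE = (
    b"<mediawiki>"
    b"<page><title>Foo</title></page>"
    b"<page><title>Bar</title></page>"
    b"</mediawiki>"
)

print(list(iter_page_titles(io.BytesIO(SAMPLE))))  # ['Foo', 'Bar']
```

Whether lxml is actually faster for a given dump would need to be measured; the sketch only shows that the calling code can stay the same under either import.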