Guest sperugin
Posted January 18, 2004

Hello,

Has anyone had any success breaking up the category RDF dump into sub-categories? The dump is just too big to work with directly.

I need to extract certain information from the category dump, but the PHP script I use to do so takes too long to run (~10 hours). I therefore wrote the following XSLT transformation to break the dump into sub-branches (e.g., Arts, Games, News) before running my PHP script.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:r="http://www.w3.org/TR/RDF/"
                xmlns:d="http://purl.org/dc/elements/1.0/"
                xmlns="http://dmoz.org/rdf"
                version="1.0">

  <xsl:output method="xml"/>

  <!-- drops every Topic whose r:id is outside the Top/News branch -->
  <xsl:template match="node()[name() = 'Topic' and not(starts-with(@r:id, 'Top/News'))]"/>

  <!-- identity template: matches any element or attribute -->
  <xsl:template match="*|@*">
    <xsl:copy>
      <!-- continues with the attributes and child nodes -->
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

However, when I execute this transformation I run out of memory. I am running it on an Apple PowerBook G4.

Does anyone have a method of breaking up the category dump that they would like to share?

Thanks,
Saverio
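One possible direction, offered as a sketch rather than a tested answer for the real dump: an XSLT 1.0 processor normally builds the whole input tree in memory before applying templates, which would explain running out of memory on a file this size. A streaming parser can apply the same filter one element at a time. The Python below assumes that Topic elements are direct children of the root RDF element and reuses the namespace URIs from the stylesheet above; the function name `extract_branch` and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the stylesheet in the post (assumed to match the dump).
RDF_NS = "http://www.w3.org/TR/RDF/"
DC_NS = "http://purl.org/dc/elements/1.0/"
DMOZ_NS = "http://dmoz.org/rdf"


def extract_branch(source, sink, prefix):
    """Copy every top-level Topic whose r:id starts with `prefix`.

    `source` is a file name or binary file object holding the dump,
    `sink` is a binary file object opened for writing.
    Returns the number of Topic elements written.
    """
    # Keep the same prefixes as the dump when elements are serialized.
    ET.register_namespace("", DMOZ_NS)
    ET.register_namespace("r", RDF_NS)
    ET.register_namespace("d", DC_NS)

    topic_tag = "{%s}Topic" % DMOZ_NS
    id_attr = "{%s}id" % RDF_NS

    sink.write(b'<?xml version="1.0" encoding="UTF-8"?>\n')
    sink.write(('<RDF xmlns="%s" xmlns:r="%s">\n' % (DMOZ_NS, RDF_NS)).encode("utf-8"))

    kept = 0
    depth = 0
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # first element seen is the document root
            depth += 1
            continue
        depth -= 1
        if depth != 1:
            continue  # only act when a direct child of the root has closed
        if elem.tag == topic_tag and elem.get(id_attr, "").startswith(prefix):
            sink.write(ET.tostring(elem, encoding="utf-8"))
            sink.write(b"\n")
            kept += 1
        # Drop the children parsed so far so memory use stays roughly constant.
        root.clear()

    sink.write(b"</RDF>\n")
    return kept
```

With a dump saved locally, a call such as `extract_branch("structure.rdf.u8", open("news.rdf", "wb"), "Top/News")` would write only the News branch; both file names here are placeholders. Each copied Topic repeats its namespace declarations, which is redundant but keeps the output well-formed.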