{CWoC} Streamlining the Symantec.pl.xml to Have Very Fast SIM Load Time!
As posted earlier today, the Symantec.pl.xml file is now available for download in a compressed form, so the download can be much faster than with the raw xml format.
As I checked this I also found that loading the Symantec Installation Manager UI was still not very fast. It took just under one minute on my VM.
Given I've monitored the pl.xml file since March 2009 (when it was still in beta) I have some thoughts on what could help improve the performance (I posted a couple of ideas on the subject: make-ns7-symantecplxml-file-easier-download-and-parse-splitting-it-smaller-files and break-down-large-or-regular-updates-small-files-and-use-differential-instead-bulk-download-or-i).
So I wanted to do some cleaning on the symantec.pl.xml to take out all of the not-used-by-me localised strings. Grep and manual operations are not an option given we have multiline entries with no matchable string, and 230,000+ lines in the xml file!
But given I've spent the last 4 weeks working with large files (200MB, 800~900K lines) for extracting interesting data from IIS log files (see {CWoC} and aila) I thought a C program would be the best answer. And I was right. In less than an hour I wrote and tested a simple program to clean up the xml as stated above.
It does so very efficiently as we can see here:
ludovic@ub-x64:~/ns7pl$ time ./clean_pl_xml c335cc1fb3abe965ca54292fd2379928 > symantec.pl.xml
Starting...We have counted 59731 lines.
...ending.
real 0m0.219s
user 0m0.200s
sys 0m0.000s
ludovic@ub-x64:~/ns7pl$
And it allows the SIM UI to load in less than half the time: 24 seconds!
But of course you have to point SIM (in the registry) to a local web server where the file is available.
So here is the source code of the program (hopefully nice and sweet, short and to the point), with the new zipped xml attached to this blog.
/******************************************************************************
* { Connect Winter of Code: SIM pl.xml string Cleanup }
* Author: Ludovic FERRE, http://www.symantec.com/connect/blogs/ludovicferre
* {CWoc} info: http://www.symantec.com/connect/search/apachesolr_...
******************************************************************************/
/*******************************************************************************
Clean up the symantec pl.xml file of unwanted localisation. Look for string tag
(open) then skip the first language (ENU) and display (or !) the l10n data from
the current line.
******************************************************************************/
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
/* SEARCH STACK VALUES
0 -> Initial state
1 -> In string
2 -> In language enu
4 -> In language post-enu
*/
bool print_line ( char * l, int * stack );
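int main ( int argc, char * argv[] ) {
    char line [ 1024 ];
    int line_count = 0;     // number of lines written to stdout
    int search_stack = 0;   // current state, see SEARCH STACK VALUES

    // The input file name is mandatory
    if ( argc < 2 ) {
        fprintf(stderr, "Usage: %s <pl.xml file>\n", argv[0]);
        return 1;
    }
    FILE *fp = fopen ( argv[1], "r" );
    if ( fp == NULL ) {
        perror(argv[1]);
        return 1;
    }

    fputs("Starting...", stderr);
    // Copy to stdout every line that print_line() decides to keep
    while ( ( fgets ( line, sizeof line, fp )) != NULL ) {
        if ( print_line( line, & search_stack )){
            line_count ++;
            printf("%s", line);
        }
    }
    fclose(fp);

    fprintf(stderr, "We have counted %d lines.\n", line_count);
    fputs ("...ending.", stderr);
    return 0;
}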
bool print_line ( char * line, int * stack ) {
    // Return false only for the lines that follow the enu language block,
    // up to (but not including) the string close tag
    char * p = NULL;
    char * q = NULL;
    switch ( * stack ){
        case 0: { // Initial state -> search for the string open tag "<string "
            p = strstr(line, "<string ");
            if ( p != NULL ) {
                * stack = 1;
            }
            return true;
        }
        case 1: { // In string -> search for language open tag & enu
            p = strstr(line, "<language ");
            q = strstr(line, "\"ENU\"" );
            if ( p != NULL && q != NULL ) {
                // We are inside language enu, which may also close on this line
                * stack = 2;
                goto check_single_line;
            }
            return true;
        }
        case 2: { // In language enu
        check_single_line:
            p = strstr(line, "</language>");
            if ( p != NULL ){
                // Leaving enu
                * stack = 4;
            }
            return true;
        }
        case 4: { // In language post enu -> drop lines until the string closes
            p = strstr(line, "</string>");
            if ( p != NULL ){
                // Leaving string
                * stack = 0;
                return true;
            }
            return false;
        }
    }
    return true;
}
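For readers who find the 0/1/2/4 values hard to follow, the same filter can be sketched with named states. This is only an illustration, not the program attached to the post: the names pl_state, keep_line and count_kept are made up for this sketch, and the only markers it relies on are the ones the code above searches for ("<string ", "<language ", "\"ENU\"", "</language>" and "</string>").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names standing in for the 0/1/2/4 values used in the post. */
enum pl_state {
    PL_OUTSIDE  = 0,  /* not inside a string element */
    PL_STRING   = 1,  /* inside a string element, ENU not seen yet */
    PL_ENU      = 2,  /* inside the ENU language element */
    PL_POST_ENU = 4   /* after ENU: drop lines until the string closes */
};

/* Returns true when the line should be kept in the cleaned file. */
bool keep_line(const char *line, enum pl_state *st)
{
    switch (*st) {
    case PL_OUTSIDE:
        if (strstr(line, "<string ") != NULL)
            *st = PL_STRING;
        return true;
    case PL_STRING:
        if (strstr(line, "<language ") == NULL ||
            strstr(line, "\"ENU\"") == NULL)
            return true;
        *st = PL_ENU;
        /* fall through: the ENU element may open and close on one line */
    case PL_ENU:
        if (strstr(line, "</language>") != NULL)
            *st = PL_POST_ENU;
        return true;
    case PL_POST_ENU:
        if (strstr(line, "</string>") != NULL) {
            *st = PL_OUTSIDE;
            return true;   /* the closing string tag itself is kept */
        }
        return false;      /* any line after ENU is dropped */
    }
    return true;
}

/* Counts how many of the n lines would be kept, starting outside a string. */
size_t count_kept(const char *const lines[], size_t n)
{
    assert(lines != NULL);
    enum pl_state st = PL_OUTSIDE;
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (keep_line(lines[i], &st))
            kept++;
    return kept;
}
```

The switch fall-through replaces the goto, so a language element that opens and closes on a single line is handled the same way as in the original. Like the original, it only drops what comes after the ENU block, so any language listed before ENU inside a string is kept.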
2 Comments
Hi
I have a publicly available pl.xml which is around 33MB in size. I just want to create the pl.xml file using SPLE so that our solution is dependent on SMP 7.1.6797 MP1, but since the pl.xml file is so huge, SPLE fails to import it and takes a lot of time. So is there any way I can extract just the SMP 7.1.6797 related part from this huge file into another pl.xml?
Thanks
I guess you need to engage with the Platform team responsible for this product.
I have no idea how the SPLE tool works, so I'll take the opportunity here to get some details: does it produce a new pl.xml with your added entries, or does it create a diff of sorts?
I can't see how the first part should work, but then having seen the diffs between versions of the file it also looks very ugly.
I don't understand how mixing product relationships, icons and package sources in a single file came about as a good idea, but that's me (and I'm quite opinionated when it comes to layered abstraction opacity and xml: I just hate the misuse of it).
Ludovic FERRÉ
Principal Remote Product Specialist
Symantec
Need help with IIS log files? Check out the self-service portal on http://aila.15-cloud.fr/