
{CWoC} Streamlining the pl.xml to Have Very Fast SIM Load Time!

Created: 22 Dec 2009 • Updated: 25 Dec 2009 • 2 comments
Ludovic Ferre

As posted earlier today, the file is now available for download in a compressed form, so downloading it can be much faster than fetching the raw xml.

While checking this I also found that the Symantec Installation Manager (SIM) UI was still slow to load: just under one minute on my VM.


Given that I've monitored the pl.xml file since March 2009 (when it was still in beta), I have some thoughts on what could improve the performance. I posted a couple of ideas on the subject: make-ns7-symantecplxml-file-easier-download-and-parse-splitting-it-smaller-files and break-down-large-or-regular-updates-small-files-and-use-differential-instead-bulk-download-or-i.

So I wanted to clean up the pl.xml and take out all of the localised strings I don't use. Grep and manual editing are not an option: the file has multi-line entries whose continuation lines contain no matchable string, and the xml is 230,000+ lines long!

But given I've spent the last 4 weeks working with large files (200MB, 800~900K lines) for extracting interesting data from IIS log files (see {CWoC} and aila) I thought a C program would be the best answer. And I was right. In less than an hour I wrote and tested a simple program to clean up the xml as stated above.
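Streaming a large file in C needs little more than a fixed buffer and fgets. The sketch below is only an illustration, and count_lines is a hypothetical helper that is not part of the program in this post. It counts lines without loading the file in memory, and it only counts a chunk as a line when the chunk ends with a newline, so a line longer than the buffer is not counted twice.

```c
#include <stdio.h>
#include <string.h>

/* Count logical lines in a stream, reading in fixed-size chunks.
 * A chunk completes a line only when it ends with '\n'; a final
 * unterminated line is counted as well. */
long count_lines(FILE *fp) {
	char buf[1024];
	long lines = 0;
	int in_line = 0;	/* 1 while the current line has not ended yet */

	while (fgets(buf, sizeof buf, fp) != NULL) {
		size_t len = strlen(buf);
		if (len > 0 && buf[len - 1] == '\n') {
			lines++;
			in_line = 0;
		} else {
			in_line = 1;	/* long line or missing final newline */
		}
	}
	if (in_line)
		lines++;	/* last line had no trailing newline */
	return lines;
}
```

Calling count_lines(fp) on a file opened with fopen returns the number of lines, whatever the file size.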

It does so very efficiently as we can see here:

ludovic@ub-x64:~/ns7pl$ time ./clean_pl_xml c335cc1fb3abe965ca54292fd2379928 >
Starting...We have counted 59731 lines.

real 0m0.219s
user 0m0.200s
sys 0m0.000s


And it lets the SIM UI load in less than half the previous time: 24 seconds!

Of course, you have to point SIM (in the registry) to a local web server where the cleaned file is available.

So here is the source code of the program (hopefully nice, sweet, short and to the point). The new xml is attached to this blog as a zip file.

/*
 * { Connect Winter of Code: SIM pl.xml string Cleanup  }
 * Author: Ludovic FERRE,
 * {CWoc} info:
 * Clean up the symantec pl.xml file of unwanted localisation. Look for the
 * string tag (open), keep the first language (ENU) and then print (or not)
 * the l10n data from the current line.
 */
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

/*
 * Search states (kept in search_stack):
 *		 0 -> Initial state
 *		 1 -> In string
 *		 2 -> In language enu
 *		 4 -> In language post-enu
 */

bool print_line ( char * l, int * stack );

int main ( int argc, char * argv[] ) {

	char line [ 1024 ];
	int line_count = 0;
	int search_stack = 0;

	if ( argc < 2 ) {
		fputs("Usage: clean_pl_xml <pl.xml file>\n", stderr);
		return 1;
	}
	fputs("Starting...", stderr);
	FILE *fp = fopen ( argv[1], "r" );
	if ( fp != NULL) {
		while ( ( fgets ( line, sizeof line, fp )) != NULL ) {
			// Only lines accepted by the state machine go to stdout
			if ( print_line( line, & search_stack )){
				line_count ++;
				printf("%s", line);
			}
		}
		fclose ( fp );
	} else {
		perror ( argv[1] );
		return 1;
	}
	fprintf(stderr, "We have counted %d lines.\n", line_count);
	fputs ("...ending.", stderr);
	return 0;
}

bool print_line ( char * line, int * stack ) {
	// return false when we are in a language tag (but not enu)

	char * p = NULL;
	char * q = NULL;

	switch ( * stack ){
		case 0:	{	//Initial state -> search for the string open tag "<string "
			p = strstr(line, "<string ");
			if ( p != NULL)	{
				* stack = 1;
			}
			return true;
		}
		case 1:	{	//In string -> search for language open tag & enu
			p = strstr(line, "<language ");
			q = strstr(line, "\"ENU\"" );
			if ( p != NULL && q != NULL ) {
				// We are inside language enu
				* stack = 2;
				// The enu entry may also close on this same line
				goto check_single_line;
			}
			return true;
		}
		case 2:	{	//In language enu
check_single_line:
			p = strstr(line, "</language>");
			if ( p != NULL ){
				// Leaving enu
				* stack = 4;
			}
			return true;
		}
		case 4:	{	//In language post enu, drop everything up to "</string>"
			p = strstr(line, "</string>");
			if ( p != NULL ){
				// Leaving string
				* stack = 0;
				return true;
			}
			else return false;
		}
	}
	return true;
}
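To see what the four states do on a multi-line entry, here is a small worked sketch. It is not the program above: the states get illustrative enum names, the same-line check is done with a switch fall-through instead of a goto, and keep_line, count_kept and the sample lines are hypothetical. The sample markup is an assumption about the file layout; the real pl.xml may look different.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Named versions of the states 0, 1, 2 and 4 (names are illustrative). */
enum state { INITIAL = 0, IN_STRING = 1, IN_ENU = 2, POST_ENU = 4 };

/* Return true when the line should be kept; update *st as tags go by. */
bool keep_line(const char *line, enum state *st) {
	switch (*st) {
	case INITIAL:
		if (strstr(line, "<string ") != NULL)
			*st = IN_STRING;
		return true;
	case IN_STRING:
		if (strstr(line, "<language ") == NULL || strstr(line, "\"ENU\"") == NULL)
			return true;
		*st = IN_ENU;
		/* fall through: the ENU entry may close on this same line */
	case IN_ENU:
		if (strstr(line, "</language>") != NULL)
			*st = POST_ENU;
		return true;
	case POST_ENU:
		if (strstr(line, "</string>") == NULL)
			return false;	/* a non-ENU translation: drop it */
		*st = INITIAL;
		return true;
	}
	return true;
}

/* Hypothetical sample: the French entry spans two lines, and its second
 * line contains nothing a grep for "FRA" could match. */
const char *sample[] = {
	"<string id=\"name\">\n",
	"  <language code=\"ENU\">Hello</language>\n",
	"  <language code=\"FRA\">Bon\n",
	"jour</language>\n",
	"</string>\n",
	"<product/>\n",
};

/* Count how many of the n lines survive the filter. */
size_t count_kept(const char **lines, size_t n) {
	enum state st = INITIAL;
	size_t kept = 0;
	for (size_t i = 0; i < n; i++)
		if (keep_line(lines[i], &st))
			kept++;
	return kept;
}
```

On this sample, count_kept(sample, 6) returns 4: the two lines of the French entry are dropped, and everything else is kept, including lines outside any string element.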

2 Comments

e098571


I have a publicly available pl.xml which is around 33MB in size. I just want to create a pl.xml file using SPLE so that our solution is dependent on SMP 7.1.6797 MP1, but because the pl.xml file is so huge, SPLE takes a lot of time and then fails to import it. Is there any way to extract just the SMP 7.1.6797 related part from this huge file into another pl.xml?



Ludovic Ferre

I guess you need to engage with the Platform team responsible for this product.

I have no idea how the SPLE tool works, so I'll take the opportunity here to get some details: does it produce a new pl.xml with your added entries, or does it create a diff of sorts?

I can't see how the first option could work, but having seen the diffs between versions of the file, the second looks very ugly too.

I don't understand how mixing product relationships, icons and package sources in a single file came about as a good idea, but that's me (and I'm quite opinionated when it comes to layered abstraction opacity and xml - I just hate the misuse of it).

I am currently off-net, on a retreat of some kind. I'll be back real soon, and you sure will hear from me then ;-).

Ludovic FERRÉ
Principal Remote Product Specialist
