
Make NS7 symantec.pl.xml file easier to download and parse by splitting it into smaller files

ludovic_ferre

Since the original release of Notification Server 7 (and even a little before that, during the beta cycle) I have downloaded the symantec.pl.xml file daily to check whether its content has changed. In fact I do not do this myself: I wrote a shell script to do it in my stead - it's so much more efficient than I could ever be ;).

In any case, after running this for a few months I now have the following data structure on my server (note that the files are named after the md5 hash value of the symantec.pl.xml):

ludovic@ub-x64:~/ns7pl$ ls -lArth
total 364M
-rw-r--r-- 1 ludovic ludovic 8.3M 2009-03-16 18:23 3f29324957727373e54f2fa70c23d055
-rw-r--r-- 1 ludovic ludovic 9.0M 2009-03-24 19:39 5149d004db29ec53bec0e4ffc8db711c
-rw-r--r-- 1 ludovic ludovic 9.2M 2009-04-02 14:24 899225afdfd68ed4da113e18e77e350f
-rw-r--r-- 1 ludovic ludovic 10M 2009-04-09 11:33 bc4e1a9ed5312059d72b07bcd4e019ae
-rw-r--r-- 1 ludovic ludovic 13M 2009-04-28 12:42 83a827d8150a2e7bf93d0e033f2db982
-rw-r--r-- 1 ludovic ludovic 13M 2009-04-29 00:28 cc1cfc81bb90b0e6d9ea830ee01e404b
-rw-r--r-- 1 ludovic ludovic 13M 2009-05-10 17:14 e977aaf8f307173784e065b8bd2183b0
-rw-r--r-- 1 ludovic ludovic 13M 2009-05-13 19:36 81145e96a0f13a52f75af905b136057e
-rw-r--r-- 1 ludovic ludovic 13M 2009-05-14 22:15 f6665d8337819f8930283b26f16785c9
-rw-r--r-- 1 ludovic ludovic 13M 2009-05-20 19:41 383658cef3290cb182dd067c8c828e2c
-rw-r--r-- 1 ludovic ludovic 13M 2009-06-12 20:50 8df13530c17605b9fc6fc5f4c18f67f9
-rw-r--r-- 1 ludovic ludovic 13M 2009-06-24 03:51 a8a83adc3042a926ef3b4e2376981ffb
-rw-r--r-- 1 ludovic ludovic 13M 2009-06-24 18:30 df45c391a1a81c6389a9f433d62a3af4
-rw-r--r-- 1 ludovic ludovic 14M 2009-07-23 21:01 193306acabb1f7a010bbf2ff9c905c29
-rw-r--r-- 1 ludovic ludovic 14M 2009-08-01 15:33 ca3fa3a4a117b170e2872cbe013b15d6
-rw-r--r-- 1 ludovic ludovic 15M 2009-08-28 05:24 c1a0907a00f7f808f52f4cd8d2e5a55e
-rw-r--r-- 1 ludovic ludovic 15M 2009-09-01 07:49 a1074cd323cb1736fcd7494289efe712
-rw-r--r-- 1 ludovic ludovic 16M 2009-09-02 16:25 f719d1be4fbfd0ba38c82a4f27c6ce14
-rw-r--r-- 1 ludovic ludovic 16M 2009-09-04 18:14 a59299cd05e5190ee55bf7d61fd36df8
-rw-r--r-- 1 ludovic ludovic 16M 2009-09-11 07:02 e3360e17d088e30507059af15da08c81
-rw-r--r-- 1 ludovic ludovic 16M 2009-09-19 00:11 91abee93d5d4990d8a34fe967affe14d
-rw-r--r-- 1 ludovic ludovic 17M 2009-09-21 08:40 7b25d9a3f4b2c3469348ee1571689302
-rw-r--r-- 1 ludovic ludovic 16M 2009-09-21 18:17 3aed6ef02a0a515443095fdff0ea0475
-rw-r--r-- 1 ludovic ludovic 17M 2009-09-24 17:21 d2672dbb801ffe452b1a2a7b91cd4aa2
-rw-r--r-- 1 ludovic ludovic 17M 2009-10-09 08:00 75146a8f826edbc5a6e6c8f8372fbfd6
-rw-r--r-- 1 ludovic ludovic 17M 2009-10-18 19:58 6696be550bd1b77c6b0bda23af25e84b
-rwxr-xr-x 1 ludovic ludovic 1.4K 2009-10-22 17:14 get7pl.sh
-rw-r--r-- 1 ludovic ludovic 17M 2009-10-30 08:36 2bad24ba55f02ecb640078144f2c3861
-rw-r--r-- 1 ludovic ludovic 33 2009-10-30 16:03 latest_pl
ludovic@ub-x64:~/ns7pl$
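
For readers who want to keep a similar archive, here is a minimal sketch of what a script like get7pl.sh could look like. It is not the actual script: the URL in PL_URL is a placeholder, the function names store_by_md5 and fetch_pl are made up for illustration, and writing the latest hash to latest_pl is only a guess based on that file being 33 bytes (32 hex digits plus a newline). It assumes curl and GNU md5sum are installed.

```shell
#!/bin/sh
# Sketch of a daily product-list fetcher (not the original get7pl.sh).
# PL_URL is a placeholder, not the real download address.
PL_URL="${PL_URL:-http://example.invalid/symantec.pl.xml}"

# Store a copy of SRC in DIR under its md5 hash and print the hash.
# A copy whose hash is already present is not stored again.
store_by_md5() {
    src=$1
    dir=$2
    hash=$(md5sum "$src" | cut -d ' ' -f 1)
    if [ ! -e "$dir/$hash" ]; then
        cp "$src" "$dir/$hash"
    fi
    # Assumption: latest_pl holds the hash of the most recent download.
    echo "$hash" > "$dir/latest_pl"
    echo "$hash"
}

# Download the file to a temporary location, then archive it by hash.
fetch_pl() {
    dir=$1
    tmp=$(mktemp) || return 1
    curl -fsS -o "$tmp" "$PL_URL" && store_by_md5 "$tmp" "$dir"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage, e.g. from a daily cron job:
# fetch_pl "$HOME/ns7pl"
```

Naming each copy after its hash means an unchanged file never produces a second copy, so every file in the directory marks a real change in content.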

We can follow the progression in file size of the pl (product list) xml file over time: it started at about 8.3 MB in March 2009 and had inflated to about 17 MB by the end of October.

Now if we peer into the file content we can find a specific point that divides the file between the product list relationships and the legal (text) content. Here's a view of this point from the latest pl.xml file:

Symc_SMP_pl_xml_SplitPlease.png

I have highlighted the section starting at line 21,784 and running down to the very end of the file (209,349 lines lower). This is where the product definitions and relationships really stop and where the translations and legal wording begin. As you can see, it takes up the largest part of the xml file: over 200K lines and 15 million bytes of the 17 MB xml file.
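
To check these proportions on a local copy, the file can be cut in two at the line where the text content begins. This is only a rough sketch: the function name split_at_line and the output file names are made up, and cutting an XML file at a line boundary does not give two well-formed XML documents, so the pieces are only good for counting lines and bytes.

```shell
#!/bin/sh
# Split FILE in two at line SPLIT: lines before SPLIT go to OUT_HEAD,
# line SPLIT and everything after it go to OUT_TAIL.
split_at_line() {
    file=$1
    split=$2
    out_head=$3
    out_tail=$4
    head -n "$((split - 1))" "$file" > "$out_head"
    tail -n "+$split" "$file" > "$out_tail"
}

# Usage (output names are placeholders):
# split_at_line symantec.pl.xml 21784 products.part strings.part
# wc -lc products.part strings.part
```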

Now wouldn't it be a good idea to split the file into smaller chunks, each downloadable as and when required? For example, with 2 MB of pl.xml and, say, 2 MB of English strings, most users would only need to download 4 MB of data to get Symantec Installation Manager started. With 2 languages (English plus any user-specific locale) this would still not be anything above the 6 MB mark.

Would this be more efficient for everyone? Less download, less web time and more flexibility? It must also be terribly complicated for Symantec developers to work with a single 17 MB file ;).

It also makes one wonder how big the pl.xml file will be after 2 years given the current growth rate...
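
As a back-of-the-envelope answer, here is a small sketch that projects the size forward from the first and last entries of the listing (8.3 MB on 2009-03-16 and 17 MB on 2009-10-30, 228 days apart). It assumes the growth stays linear, which is only an assumption, and the sizes are the rounded values printed by ls. The function name project_size is made up for illustration.

```shell
#!/bin/sh
# Linear projection: size_now + rate * days_ahead, where
# rate = (size_now - size_start) / days_elapsed.
# Arguments: size_start size_now days_elapsed days_ahead (sizes in MB).
project_size() {
    awk -v s0="$1" -v s1="$2" -v elapsed="$3" -v ahead="$4" \
        'BEGIN { printf "%.1f\n", s1 + (s1 - s0) / elapsed * ahead }'
}

# 8.3 MB -> 17 MB over 228 days, projected 730 days (2 years) ahead.
project_size 8.3 17 228 730
```

Under that assumption the file would reach roughly 45 MB two years after the last download in the listing; faster-than-linear growth would give a larger figure.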