Endpoint Management Community Blog

{CWoC} A Good Knock on the Performance Issue, But It Did Not Come from Any Indexing Yet

Created: 30 Dec 2009 • Updated: 30 Dec 2009
Ludovic Ferre

In my previous blog post (here) and my self-reply to it, I was looking for a way to reduce the negative impact a large cache had on performance.

It included some really wacky ideas (the front and rear indexing, for example) and some better ones (storing IP addresses in their original form, i.e. as 32-bit unsigned integers). I spent a lot of time thinking about the best possible indexing (considering tree implementations, sorted lists, as well as block memory allocation to limit the number of malloc calls). After a walk this morning (I went to the weekly street market to refill my fresh-food basket) I concluded that performance came first, so I decided to revert the cache store to the basic array used originally.

With that decided, I knew I would gain some performance over the linked-list implementation, but I also wanted to build on the experience and implement some mechanism to reduce the overall time-to-delivery of a parsed file result.

So with the single array I took the front~rear idea and used it to implement a kinda-sorted list: GUIDs are stacked from the array head, and IP addresses are stacked from the array tail. I also made the move to store IP addresses in native form, converting the dotted-decimal representation to a 32-bit int before storing the value in lieu of a pointer in the array. This again saves on malloc calls, as well as on processing time.
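To make the layout concrete, here is a minimal sketch of a two-ended array, not the actual aila code. The names `cache`, `cache_slot`, `parse_ipv4`, `cache_push_guid`, `cache_push_ip`, `cache_find_ip` and the capacity `CACHE_SIZE` are all hypothetical. It assumes each slot holds either a GUID string pointer or a raw IPv4 value, and that addresses are parsed with `sscanf` into a host-order integer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_SIZE 2048 /* hypothetical capacity */

/* A slot holds either a pointer to a GUID string or a raw IPv4 value. */
typedef union {
    const char *guid;
    uint32_t ip;
} cache_slot;

typedef struct {
    cache_slot slots[CACHE_SIZE];
    size_t guid_count; /* GUIDs fill slots[0 .. guid_count - 1] */
    size_t ip_count;   /* IPs fill slots[CACHE_SIZE - ip_count .. CACHE_SIZE - 1] */
} cache;

/* Convert "a.b.c.d" to a host-order 32-bit value.
 * Returns 0 on success, -1 on malformed input. */
int parse_ipv4(const char *text, uint32_t *out)
{
    unsigned a, b, c, d;
    char trailing;

    /* Exactly four numeric fields and nothing after them. */
    if (sscanf(text, "%u.%u.%u.%u%c", &a, &b, &c, &d, &trailing) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;
    *out = ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | (uint32_t)d;
    return 0;
}

/* Stack a GUID pointer from the head. Returns -1 when the two ends meet. */
int cache_push_guid(cache *c, const char *guid)
{
    assert(c != NULL);
    if (c->guid_count + c->ip_count >= CACHE_SIZE)
        return -1;
    c->slots[c->guid_count++].guid = guid;
    return 0;
}

/* Stack an IP value from the tail. Returns -1 when the two ends meet. */
int cache_push_ip(cache *c, uint32_t ip)
{
    assert(c != NULL);
    if (c->guid_count + c->ip_count >= CACHE_SIZE)
        return -1;
    c->ip_count++;
    c->slots[CACHE_SIZE - c->ip_count].ip = ip;
    return 0;
}

/* Linear search restricted to the IP region. Returns the slot index or -1. */
long cache_find_ip(const cache *c, uint32_t ip)
{
    for (size_t i = CACHE_SIZE - c->ip_count; i < CACHE_SIZE; i++)
        if (c->slots[i].ip == ip)
            return (long)i;
    return -1;
}
```

Because the two kinds of entries grow toward each other, a lookup only has to scan the region that can contain its type, and an IP comparison becomes a single integer compare.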

A final change, made possible by the two-dimensional array (did I mention this before? One dimension holds the data or data pointer and the other holds the hit counter), was to drop the string cache entry structure (sce). The processing is simpler and there are fewer operations in the current version of the code (I need to review my plans, but I think 0.2.3 is going to make the next tag :D).
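As an illustration only, a two-column array of this kind could look like the sketch below. The names `table`, `used`, `record_hit`, `DATA`, `HITS` and `ENTRIES` are hypothetical, and using `uintptr_t` so that a cell can carry either an integer value or a pointer is my assumption, not necessarily what aila does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 2048 /* hypothetical capacity */
#define DATA 0       /* column holding the value (or a pointer cast to an integer) */
#define HITS 1       /* column holding the hit counter */

static uintptr_t table[ENTRIES][2];
static size_t used; /* number of rows currently in use */

/* Record one hit for `value`: bump its counter if it is already cached,
 * otherwise append a new row. Returns the row index, or -1 if the table is full. */
long record_hit(uintptr_t value)
{
    assert(used <= ENTRIES);
    for (size_t i = 0; i < used; i++) {
        if (table[i][DATA] == value) {
            table[i][HITS]++;
            return (long)i;
        }
    }
    if (used == ENTRIES)
        return -1;
    table[used][DATA] = value;
    table[used][HITS] = 1;
    return (long)used++;
}
```

With the counter stored next to the data, no separate per-entry structure has to be allocated or dereferenced.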

Finally, the two-dimensional array with its hit counters gives us a good option to improve performance by sorting the array to some extent. I'm wondering how much improvement would come from moving the top hitters to the bottom of the stack, so that a stack search would find them quicker.

Anyway, we will see, but I suspect I can spare some cycles for this: a big (cache) hitter located near the top of the stack when 2,000 entries are present could rapidly waste millions of search cycles (only 1,000 hits are needed to waste a million cycles when the cache entry sits at position 1,000).
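The arithmetic above can be checked with a small sketch. It assumes a linear search that starts at index 0 and spends one comparison per row visited, so a row at 1-based position p costs p comparisons per hit. The `row` struct and the functions `promote_top_hitters` and `search_cost` are hypothetical and stand in for the two-column array, and sorting with `qsort` is just one possible way to move the top hitters forward.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical row: a cached value and its hit counter. */
typedef struct {
    uint32_t value;
    uint32_t hits;
} row;

/* qsort comparator: rows with more hits sort first. */
static int by_hits_desc(const void *a, const void *b)
{
    const row *ra = a;
    const row *rb = b;
    return (ra->hits < rb->hits) - (ra->hits > rb->hits);
}

/* Move the most frequently hit rows to the start of the array. */
void promote_top_hitters(row *rows, size_t n)
{
    assert(rows != NULL || n == 0);
    qsort(rows, n, sizeof rows[0], by_hits_desc);
}

/* Total comparisons spent on successful lookups by a search starting at
 * index 0: a row at 1-based position p costs p comparisons per hit. */
uint64_t search_cost(const row *rows, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += (uint64_t)rows[i].hits * (i + 1);
    return total;
}
```

Under these assumptions, a single entry with 1,000 hits sitting at position 1,000 accounts for 1,000 x 1,000 = 1,000,000 comparisons, and the same entry at position 1 accounts for only 1,000.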

Here is some timing data for the current version of aila (on my 32-bit Xubuntu):

ludovic@xubuntu-laptop:~/dev/altiris-ns-tooling/aila$ time ./aila-linux-0.2.2-long --file ../DATA/ex091207.log > /dev/null

real    0m30.784s
user    0m30.282s
sys     0m0.484s

ludovic@xubuntu-laptop:~/dev/altiris-ns-tooling/aila$ time ./aila-linux-0.2.2-long --file ../DATA/ex091207.log > /dev/null

real    0m31.890s
user    0m31.274s
sys     0m0.620s

ludovic@xubuntu-laptop:~/dev/altiris-ns-tooling/aila$ time ./aila-linux-0.2.2-int --file ../DATA/ex091207.log > /dev/null

real    0m31.879s
user    0m31.278s
sys     0m0.600s

ludovic@xubuntu-laptop:~/dev/altiris-ns-tooling/aila$ time ./aila-linux-0.2.2-long --file ../DATA/ex091207.log > /dev/null

real    0m33.422s
user    0m32.594s
sys     0m0.780s

And here's the full output (no cache dump available for public view yet):

Program read 173584167 bytes from 988232 lines

Mime type analysis summary results:
	File type= htm , page hits= 5660
	File type= js  , page hits= 1008
	File type= css , page hits= 438
	File type= aspx, page hits= 539084
	File type= asmx, page hits= 31500
	File type= other, page hits= 410534

Altiris Agent request analysis summary results:
	Agent request= Reg Client, page hits= 113
	Agent request= Get Policies, page hits= 43821
	Agent request= Get Pkg Info, page hits= 31944
	Agent request= Get Snapshot, page hits= 459178
	Agent request= Post Event , page hits= 296685
	Agent request= Other, page hits= 156483

IIS Web-applications analysis summary results:
	Webapp= /Altiris/NS/Agent/, dir hits = 832048
	Webapp= /Altiris/NS/NSCap/, dir hits = 127
	Webapp= /Altiris/NS/, dir hits = 32722
	Webapp= /Altiris/Resource/, dir hits = 196
	Webapp= /Altiris/IRA[1]/, dir hits = 10041
	Webapp= Others, dir hits = 113090

[1] IRA is an abbreviation of InventoryRuleManagement/Agent

Detailed IIS status code analysis results:
	IIS Status code= Success (1xx,2xx), hits count = 972126
	IIS Status code= Redirected (3xx), hits count = 15465
	IIS Status code= Client error (4xx), hits count = 621
	IIS Status code= Server error (5xx), hits count = 12

Detailed IIS status code analysis results:
	Sub Status code= 0, hits count = 988223
	Sub Status code= 9, hits count = 1

Detailed IIS Win32 status code analysis results:
	Win32 Status code= Win32 Success, hits count = 967105
	Win32 Status code= Win32 Failure > 0, hits count = 21119

24 hour hit counters:
	Hits counted during hour  0 to  1  was 27845
	Hits counted during hour  1 to  2  was 23114
	Hits counted during hour  2 to  3  was 27810
	Hits counted during hour  3 to  4  was 20752
	Hits counted during hour  4 to  5  was 24887
	Hits counted during hour  5 to  6  was 20665
	Hits counted during hour  6 to  7  was 29071
	Hits counted during hour  7 to  8  was 44988
	Hits counted during hour  8 to  9  was 82585
	Hits counted during hour  9 to 10  was 80390
	Hits counted during hour 10 to 11  was 62526
	Hits counted during hour 11 to 12  was 65265
	Hits counted during hour 12 to 13  was 85627
	Hits counted during hour 13 to 14  was 61712
	Hits counted during hour 14 to 15  was 39595
	Hits counted during hour 15 to 16  was 48961
	Hits counted during hour 16 to 17  was 42308
	Hits counted during hour 17 to 18  was 29446
	Hits counted during hour 18 to 19  was 29471
	Hits counted during hour 19 to 20  was 24499
	Hits counted during hour 20 to 21  was 26200
	Hits counted during hour 21 to 22  was 20509
	Hits counted during hour 22 to 23  was 31691
	Hits counted during hour 23 to 24  was 38307

Brought to you by {Connect Winter of Code}