A short comparison between Python3 and NodeJS

  1. Summary
  2. Technical Explanation
  3. Test-Case Result

Summary

I wanted to analyze some chat logs I got from a Twitch stream (with permission from the streamer).
The chat messages are saved in files named after the day they were sent, and all of these files sit in one single folder. I built a short script that reads through all of them line by line, splits each message into words, and counts how many times each word was said (by any user). So basically the "word appearances" of each word over all the days the streamer was live.
I first wrote it in Python3, but then I wondered how NodeJS, the tool with which I capture the chat messages, would compare.
TL;DR: Both have approximately the same speed (8.4 seconds for 74 files with 2 million lines, +/- 100ms).

Technical Explanation

Both the NodeJS and the Python3 program read each file, split it into lines, then split each chat message into words and add those words to a dictionary.
The key of that dictionary is the word and the value is the number of appearances. To sort the dictionary, I add each key and value as a two-item list to an outer list and sort that list by the value, i.e. the second item of each pair.
To illustrate:

File-Lines: 19:37 - UserName - Chat Message Chat
Dictionary: stats["Message"] = 1, stats["Chat"] = 2
Sort-List: list = [["Message",1],["Chat",2]]

Then the list gets sorted by the second value (the count of the word's appearances) and written as a JSON string to a file.
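
The steps described here can be sketched roughly as follows. This is not the original speedtest.py, just a minimal illustration under a few assumptions: the line layout is taken to be "HH:MM - UserName - message" as in the example above, the function names (extract_words, count_words, sorted_pairs) are made up for this sketch, collections.Counter stands in for the plain dictionary, and ascending sort order is assumed because that is what the example list shows.

```python
import json
from collections import Counter


def extract_words(line):
    """Return the words of the chat message in one log line.

    Assumption: lines look like 'HH:MM - UserName - message text'.
    Lines that do not match this layout yield no words.
    """
    parts = line.rstrip("\n").split(" - ", 2)
    if len(parts) < 3:
        return []
    return parts[2].split()


def count_words(lines):
    """Count how often each word appears across all given log lines."""
    stats = Counter()  # a dict subclass: word -> number of appearances
    for line in lines:
        stats.update(extract_words(line))
    return stats


def sorted_pairs(stats):
    """Turn the dictionary into [word, count] pairs sorted by count.

    Assumption: ascending order, matching the example list in the text.
    """
    pairs = [[word, count] for word, count in stats.items()]
    pairs.sort(key=lambda pair: pair[1])
    return pairs


lines = ["19:37 - UserName - Chat Message Chat\n"]
pairs = sorted_pairs(count_words(lines))
print(json.dumps(pairs))  # prints [["Message", 1], ["Chat", 2]]
```

To write the result to a file instead of printing it, json.dump(pairs, handle) on an opened file would do the same job.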

Test-Case Result

This particular test case was made from 74 days' worth of chat logs, with files ranging from 20KB to 4MB in size.
In total this came to over 2 million lines of chat messages.
Those 2 million lines were read in, split into words, counted in a dictionary, put into a list, sorted, and dumped as a .json file.
The test result:


[user@home testcase]$ time python3 speedtest.py streamername
Starting to count...
Finished counting.
Appending...
Finished appending.
Starting the sorting...
Sorting finished.
Starting to write to files...
Writing finished.
Files: 74
Lines: 2023748

real    0m8.338s
user    0m7.773s
sys     0m0.545s
[user@home testcase]$ 
[user@home testcase]$ time node speedtest.js streamername
Starting to count...
Finished counting.
Appending...
Finished appending.
Starting the sorting...
Sorting finished.
Starting to write to files...
Writing finished.
Files: 74
Lines: 2023748

real    0m8.468s
user    0m8.045s
sys     0m0.446s
[user@home testcase]$

The more often I ran this, the closer both got to an average of 8.4 seconds.
Of course it's easier to write such things in Python, but for me it is good to know that I do not get any speed penalty for using NodeJS.
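
For anyone who wants to repeat a measurement like this, here is a rough sketch of how the "Files" and "Lines" totals and a per-stage timing could be produced from inside a Python script, in addition to the overall numbers the shell's time command gives. It is only an illustration, not the original test script: the helper names (count_files_and_lines, timed) and the two sample file names are hypothetical, and a temporary folder with three made-up lines stands in for the real log folder.

```python
import tempfile
import time
from pathlib import Path


def count_files_and_lines(folder):
    """Return (number of files, total number of lines) for one folder."""
    files = [path for path in Path(folder).iterdir() if path.is_file()]
    total_lines = 0
    for path in files:
        # errors="replace" keeps odd bytes in chat text from aborting the run
        with path.open(encoding="utf-8", errors="replace") as handle:
            total_lines += sum(1 for _ in handle)
    return len(files), total_lines


def timed(label, func, *args):
    """Run func(*args), print the elapsed wall-clock time, return the result."""
    start = time.perf_counter()
    result = func(*args)
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return result


# Hypothetical sample data: two per-day files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "2020-01-01.txt").write_text(
        "19:37 - UserName - Chat Message Chat\n19:38 - Other - Hi\n",
        encoding="utf-8",
    )
    Path(folder, "2020-01-02.txt").write_text(
        "20:00 - UserName - Hello\n", encoding="utf-8"
    )
    files, lines = timed("Counting", count_files_and_lines, folder)
    print("Files:", files)
    print("Lines:", lines)
```

Wrapping each stage (counting, appending, sorting, writing) in timed would show where the 8.4 seconds actually go, which the overall real/user/sys figures cannot tell you.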