Trillian/ICQ/MSN Instant Messaging Log Merger by zAlbee

IMmerge 1.05 – Improved Timestamps, Group Chats

December 18, 2011

IMmerge 1.05 is now available for download. This release further improves the accuracy of parsing plain-text logs, which is useful when converting to an XML format for Trillian’s Activity History Viewer. It also adds the option of specifying a custom timestamp format, in case you don’t use the default [hh:mm] timestamp given by Trillian (or its variants with AM/PM and/or seconds). Lastly, it fixes several parsing issues with group chats, along with some other bugs.
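To give a rough idea of what recognizing the default timestamp and its variants involves, here is a minimal Python sketch. It is only an illustration: the pattern `TIMESTAMP` and the helper `parse_timestamp` are hypothetical names, not part of IMmerge, and the syntax IMmerge itself uses for custom timestamps is not shown here.

```python
import re
from datetime import time

# Hypothetical pattern (not IMmerge's own): [hh:mm], optional :ss, optional AM/PM.
TIMESTAMP = re.compile(
    r"^\[(\d{1,2}):(\d{2})(?::(\d{2}))?(?:\s*([AaPp][Mm]))?\]\s*"
)

def parse_timestamp(line):
    """Return (time, rest_of_line), or None if the line has no valid leading timestamp."""
    m = TIMESTAMP.match(line)
    if m is None:
        return None
    hour, minute = int(m.group(1)), int(m.group(2))
    second = int(m.group(3) or 0)
    meridiem = m.group(4)
    if meridiem:
        # 12-hour clock: 12 AM becomes 00:xx, 12 PM stays 12:xx.
        hour = hour % 12 + (12 if meridiem.lower() == "pm" else 0)
    try:
        return time(hour, minute, second), line[m.end():]
    except ValueError:
        # Digits that do not form a real time of day, e.g. [25:00].
        return None
```

For example, `parse_timestamp("[06:00] User2: Hello.")` yields the time 06:00:00 and the remaining text `User2: Hello.`, while a line with no leading timestamp yields `None`.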

Improved Message Recognition Using Timestamps

Say you are having a conversation with a buddy, and in that conversation, you copy/paste a transcript of a previous chat. If the chat is saved in a plain-text format, it may be difficult to tell the pasted lines apart from real messages. Here’s an example:

Session Start (User:User2): Mon Nov 07 06:00:00 2011
[06:00] User2: Hello.
[06:01] User: Hi.
[06:01] User: Can you repeat back the beginning of this chat to me?
[06:02] User2: OK. Now I will copy/paste the previous messages.
[06:00] User2: Hello.
[06:01] User: Hi.
[06:01] User: Can you repeat back the beginning of this chat to me?
[06:03] User: Thank you!
Session Close (User2): Mon Nov 07 07:00:00 2011

At first glance, this transcript appears to contain 8 messages, but there are actually only 5 true messages. User2 copy/pasted the first 3 messages and included them all in the 4th message, which therefore spans 4 lines. Even a human has to look closely to figure this out, so what hope does a machine have? Well, if the log is timestamped, we can infer that a copy/paste happened because the chat seemingly went backwards in time between line 4 [06:02] and line 5 [06:00], which would be impossible. (Thanks to Stefan M. for pointing this out.) This logic is now built into IMmerge 1.05, and here is the result of applying it to the above text log:

[Image: Converted chat log]
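The backwards-timestamp check described above can be sketched in a few lines of Python. This is a simplified illustration, not IMmerge's actual code: the `LINE` pattern and `group_messages` function are hypothetical, the sketch assumes only the default [hh:mm] timestamp, and it expects just the lines between Session Start and Session Close.

```python
import re

# Hypothetical line shape: "[hh:mm] Sender: text" (default timestamp only).
LINE = re.compile(r"^\[(\d{2}):(\d{2})\] ([^:]+): (.*)$")

def group_messages(lines):
    """Group log lines into messages; a backwards timestamp continues the previous message."""
    messages = []   # each entry: [minutes_since_midnight, sender, text]
    latest = None   # timestamp of the most recent true message
    for line in lines:
        m = LINE.match(line)
        minutes = (int(m.group(1)) * 60 + int(m.group(2))) if m else None
        if messages and (m is None or minutes < latest):
            # No timestamp, or time went backwards: treat as text pasted
            # inside the previous message.
            messages[-1][2] += "\n" + line
        elif m:
            messages.append([minutes, m.group(3), m.group(4)])
            latest = minutes
    return messages

log = [
    "[06:00] User2: Hello.",
    "[06:01] User: Hi.",
    "[06:01] User: Can you repeat back the beginning of this chat to me?",
    "[06:02] User2: OK. Now I will copy/paste the previous messages.",
    "[06:00] User2: Hello.",
    "[06:01] User: Hi.",
    "[06:01] User: Can you repeat back the beginning of this chat to me?",
    "[06:03] User: Thank you!",
]
messages = group_messages(log)
print(len(messages))  # 5 messages; the 4th one spans 4 lines
```

Note that this simple rule only catches pasted lines whose timestamps are earlier than the message containing them, and it ignores chats that run past midnight.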

As the converted log shows, IMmerge correctly identifies which message was sent by which user at what time, including the part that was copy/pasted! Of course, this is not 100% foolproof. There are valid cases where a timestamp only appears to go “backwards”: the chat went past midnight and moved on to a new day. Fortunately, IMmerge already handles this, but previous versions applied the new-day rule too often. Version 1.05 is much improved: it first eliminates all impossible options, then makes a decision, or prompts the user to choose if the result is still ambiguous.
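As an illustration of what "eliminating impossible options" could look like, here is a small Python sketch. The rule it uses (a new-day reading is impossible if it would fall after the Session Close time) is my assumption for the sake of the example and not necessarily the check IMmerge performs; the function name and option labels are hypothetical too.

```python
from datetime import datetime, time, timedelta

def options_for_backwards_stamp(prev, stamp, session_close):
    """List the possible readings of a timestamp earlier than the previous message.

    prev          -- datetime of the previous true message
    stamp         -- time of day on the suspicious line (earlier than prev.time())
    session_close -- datetime taken from the "Session Close" line
    """
    # Assumed always possible: the line is pasted text inside the previous message.
    options = ["pasted"]
    next_day = datetime.combine(prev.date() + timedelta(days=1), stamp)
    if next_day <= session_close:
        # A midnight rollover only fits if the session was still open by then.
        options.append("next_day")
    return options

# The example log above: the session closed at 07:00 the same day,
# so a new-day reading of [06:00] is ruled out.
same_day = options_for_backwards_stamp(
    datetime(2011, 11, 7, 6, 2), time(6, 0), datetime(2011, 11, 7, 7, 0)
)

# A made-up session that runs past midnight: both readings survive,
# so the user would have to be asked.
overnight = options_for_backwards_stamp(
    datetime(2011, 11, 7, 23, 58), time(0, 1), datetime(2011, 11, 8, 0, 30)
)
```

When only one option survives, the decision can be made automatically; when more than one remains, the choice is ambiguous and is left to the user.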

Thanks to all the people that reported bugs and helped test. As always, please feel free to email me with any suggestions or concerns!


Filed under: Algorithms, IMmerge v1, Release
December 18th, 2011 01:10:51