Statistical Analysis With Java and PokerFTP

I've been trying to generate some VPIP distribution charts using the data from PokerFTP.

It's all done in Java, which is a nice language, but unfortunately I keep on hitting memory problems due to the amount of data that I need to manipulate.

I am considering filtering my hand processing to exclude players with under 100 hands. That will remove players for whom I don't have a meaningful amount of data. However, it will also add survivorship bias to my results, as it will remove players who are so bad that they lose their bankrolls within 100 hands.
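As a rough sketch of how the filter could work without holding every hand in memory, the totals can be kept per player and the threshold applied afterwards. Everything named here (HandFilter, PlayerStats, record, withAtLeast) is hypothetical, and it assumes each hand can be reduced to a player name, a VPIP flag and an amount won; it is not how PokerFTP itself exposes the data.

```java
import java.util.HashMap;
import java.util.Map;

public class HandFilter {
    /** Running totals for one player (hypothetical structure). */
    public static final class PlayerStats {
        public int hands;       // hands seen for this player
        public int vpipHands;   // hands where the player voluntarily put money in the pot
        public double netWon;   // cumulative winnings, in whatever unit the data uses

        public double vpip() {
            return hands == 0 ? 0.0 : (double) vpipHands / hands;
        }
    }

    private final Map<String, PlayerStats> stats = new HashMap<>();

    /** Fold one hand into the running totals instead of storing the hand itself. */
    public void record(String player, boolean vpip, double won) {
        PlayerStats s = stats.computeIfAbsent(player, k -> new PlayerStats());
        s.hands++;
        if (vpip) {
            s.vpipHands++;
        }
        s.netWon += won;
    }

    /** Returns only the players with at least minHands hands recorded. */
    public Map<String, PlayerStats> withAtLeast(int minHands) {
        Map<String, PlayerStats> out = new HashMap<>();
        for (Map.Entry<String, PlayerStats> e : stats.entrySet()) {
            if (e.getValue().hands >= minHands) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }
}
```

Memory use then grows with the number of players rather than the number of hands, and the players under the threshold are still in the map if I want to look at them separately.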

To test whether there is significant survivorship bias from excluding players with under 100 hands, I need to analyse the win rates for these players and see whether the group contains a larger proportion of losers than the players for whom I have more hands.
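One possible way to make that comparison concrete is a two-proportion z-test on the fraction of losing players in each group. This is only a sketch of a standard test, not something I have settled on; the class and method names are made up, and it assumes each player's net winnings are available as a plain array.

```java
public class SurvivorshipCheck {
    /** Fraction of players whose net winnings are negative. */
    public static double loserFraction(double[] netWon) {
        if (netWon.length == 0) {
            return Double.NaN; // no players, so the fraction is undefined
        }
        int losers = 0;
        for (double w : netWon) {
            if (w < 0) {
                losers++;
            }
        }
        return (double) losers / netWon.length;
    }

    /**
     * Two-proportion z statistic using the pooled proportion.
     * Group A could be the under-100-hands players, group B the rest.
     * The result is NaN if nobody (or everybody) in both groups is a loser.
     */
    public static double zStatistic(int losersA, int nA, int losersB, int nB) {
        double pA = (double) losersA / nA;
        double pB = (double) losersB / nB;
        double pooled = (double) (losersA + losersB) / (nA + nB);
        double se = Math.sqrt(pooled * (1 - pooled) * (1.0 / nA + 1.0 / nB));
        return (pA - pB) / se;
    }
}
```

Under the usual normal approximation, an absolute z value above roughly 1.96 would suggest a difference at the 5% level. One caveat: results over fewer than 100 hands are very noisy, so a player showing a loss in the small-sample group is not necessarily a bad player.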