I know I've already posted this on RONUA, but I'll post it here too.
Full quote from
http://blogs.msdn.com/rjacobs/archive/2006/09/25/770451.aspx
Here is the scenario…
Every day you get a text file with a number of financial records
in a fixed length format. Your job is to identify any new or changed
records from the previous day and be able to produce an output file in
the same format containing only the new or changed records. You can
make no assumptions about the ordering of records in the file.
Here is a sample of the data
1310|512|086048610|01/01/1996|WB| |12/31/9999|1290.00 |USD5 |
1310|512|110000011|06/10/2002|WB| |12/31/9999|100.00 |USD5 |
1310|512|110000111|06/10/2002|WB| |12/31/9999|100.00 |USD5 |
The data files can get quite large (3GB)
Question for you: How would you architect the solution for this to achieve optimal performance?
The answers are at the same address (but it's better to think it over first):
http://blogs.msdn.com/rjacobs/archive/2006/09/25/770451.aspx
Hmm, with the classic Unix commands `sort` and `diff`? 😀 For the general case you've described, that seems to me as-efficient-as-you-can-get.
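For anyone who wants to try it, here is a minimal sketch of the sort-then-compare idea. Two things in it are assumptions rather than part of the suggestion above: the function name `new_or_changed` and the file names are made up for illustration, and `comm -13` is swapped in for `diff`, since `comm` prints exactly the lines that appear only in the second sorted file, which is the output format the problem asks for. It also assumes one record per line and treats a changed record simply as a line that did not exist in yesterday's file.

```shell
#!/bin/sh
# Hypothetical helper: new_or_changed OLD_FILE NEW_FILE OUT_FILE
# Writes to OUT_FILE every line of NEW_FILE that is not present in OLD_FILE.
new_or_changed() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1

    # LC_ALL=C forces plain byte ordering, so sort and comm agree on the order.
    LC_ALL=C sort "$1" > "$old_sorted"
    LC_ALL=C sort "$2" > "$new_sorted"

    # comm -13 suppresses column 1 (lines only in OLD) and column 3 (lines in both),
    # leaving only the lines unique to NEW: the new or changed records.
    LC_ALL=C comm -13 "$old_sorted" "$new_sorted" > "$3"

    rm -f "$old_sorted" "$new_sorted"
}

# Example call (file names are placeholders):
# new_or_changed yesterday.txt today.txt delta.txt
```

`sort` spills to temporary files when the input does not fit in memory, and `comm` reads both inputs in a single pass, so neither step should need the whole 3GB file in RAM. Note that the output comes out in sorted order, not in the original file order.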
It's great how everyone on Unix thinks of that… while I, as a database programmer, think of databases… and of leaving the work to others.