Hi,
Thanks again. Great library. As you know, I like to read and do "stuff" with large files. I would like to read a file and optionally get progress reports. What would be really nice, though, is if the file reader could notify the caller, e.g. via an event, each time an individual game has been parsed.
I experimented with the million-game database (about 2.2M games) that I mentioned in the large-file request. I did something along the lines of:
[Serializable]
public struct PGNGame
{
    public int White { get; set; }
    public int Black { get; set; }
    public GameResult Result { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        int gamesRead = 0;
        int maxPlayerId = 0;
        int maxGameId = 0;
        ConcurrentDictionary<int, string> playerBase =
            new ConcurrentDictionary<int, string>();
        ConcurrentDictionary<int, PGNGame> gameBase =
            new ConcurrentDictionary<int, PGNGame>();
        var reader = new PgnReader();
        var start = DateTime.Now;
        var parsedGames = new BlockingCollection<Game>();
        var queue = new BlockingCollection<List<string>>();

        Task.Run(() =>
        {
            foreach (var gameData in queue.GetConsumingEnumerable())
            {
                var data = gameData.Aggregate((x, y) => x + y);
                var games = reader.ReadFromString(data);
                foreach (var game in games.Games)
                    parsedGames.Add(game);
                if (parsedGames.Count > 100000)
                    Thread.Sleep(500);
            }
        });

        Task.Run(() =>
        {
            foreach (var parsedGame in parsedGames.GetConsumingEnumerable())
            {
                Task.Run(() =>
                {
                    var white = parsedGame.WhitePlayer;
                    var black = parsedGame.BlackPlayer;
                    if (!playerBase.Values.Any(name => name == white))
                    {
                        playerBase[Interlocked.Increment(ref maxPlayerId)] = white;
                    }
                    if (!playerBase.Values.Any(name => name == black))
                    {
                        playerBase[Interlocked.Increment(ref maxPlayerId)] = black;
                    }
                    var whiteId = playerBase.First(kvp => kvp.Value == white).Key;
                    var blackId = playerBase.First(kvp => kvp.Value == black).Key;
                    gameBase[Interlocked.Increment(ref maxGameId)] = new PGNGame
                    {
                        White = whiteId,
                        Black = blackId,
                        Result = parsedGame.Result
                    };
                    if (maxGameId % 100 == 0)
                    {
                        var now = DateTime.Now;
                        var secs = (now - start).TotalSeconds;
                        var speed = maxGameId / secs;
                        var estimated = 2200000 / speed;
                        Console.WriteLine("Games read: {0}", gamesRead);
                        Console.WriteLine("Queue length: {0}/{1}", queue.Count, parsedGames.Count);
                        Console.WriteLine("#Players: {0}. #Games: {1}. Speed: {2}. Estimated duration: {3}.",
                            maxPlayerId, maxGameId, speed, estimated);
                    }
                });
            }
        });

        using (var streamReader = new StreamReader("millionbase-2.22.pgn"))
        {
            var pgn = new List<string>();
            while (!streamReader.EndOfStream)
            {
                var line = streamReader.ReadLine().Trim();
                var isNewGame = line.StartsWith("[Event");
                if (isNewGame && pgn.Any())
                {
                    queue.Add(pgn);
                    Interlocked.Increment(ref gamesRead);
                    pgn = new List<string>();
                    if (queue.Count > 10000)
                        Thread.Sleep(500);
                }
                pgn.Add(line);
            }
        }
    }
}
With this I am able to sustain speeds of 300-450 games per second. As you can see, I use a producer-consumer scheme and "throttle" the producers in order to keep the blocking queues manageable. But I cheat... The line:
var isNewGame = line.StartsWith("[Event");
works only because I know that every game in that PGN file is decently formatted. And as we all know, that is not guaranteed by the rather loose PGN input format.
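To illustrate what a slightly less brittle split could look like, here is a minimal sketch that starts a new game whenever a tag line follows movetext, instead of relying on the `[Event` tag specifically. `PgnSplitter` and `SplitGames` are hypothetical names, not part of the library, and this is still a heuristic: it assumes each game has at least one tag line, and it would split wrongly if a multi-line comment contained a line that both starts with `[` and ends with `]`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the library.
public static class PgnSplitter
{
    // Lazily yields the raw text of one game at a time.
    public static IEnumerable<string> SplitGames(TextReader input)
    {
        var current = new StringBuilder();
        bool sawMovetext = false;
        string line;
        while ((line = input.ReadLine()) != null)
        {
            string trimmed = line.Trim();
            // Assumption: a tag pair occupies a whole line, e.g. [White "..."].
            bool isTagLine = trimmed.StartsWith("[", StringComparison.Ordinal)
                          && trimmed.EndsWith("]", StringComparison.Ordinal);

            if (isTagLine && sawMovetext)
            {
                // A tag line after movetext means the previous game is complete.
                yield return current.ToString();
                current.Clear();
                sawMovetext = false;
            }
            else if (!isTagLine && trimmed.Length > 0)
            {
                sawMovetext = true;
            }
            current.AppendLine(trimmed);
        }

        // Emit the last game, which has no following tag line to trigger it.
        if (current.ToString().Trim().Length > 0)
            yield return current.ToString();
    }
}
```

Because the method is an iterator, passing it the `StreamReader` from the loop above would hand out games one by one without loading the whole file into memory.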
Would per-game notifications like this be feasible to add to the library?
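To make the request concrete, here is a rough sketch of the kind of notification I have in mind. All names (`NotifyingGameReader`, `GameParsed`, `GameParsedEventArgs`) are made up for illustration and are not the library's API. The parser is passed in as a delegate, so it could for example wrap the `reader.ReadFromString(...)` call from my code above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event payload: the parsed game plus a running count for progress.
public sealed class GameParsedEventArgs<TGame> : EventArgs
{
    public GameParsedEventArgs(TGame game, int gamesParsed)
    {
        Game = game;
        GamesParsed = gamesParsed;
    }

    public TGame Game { get; }
    public int GamesParsed { get; }
}

// Hypothetical reader that raises an event after each game is parsed.
public sealed class NotifyingGameReader<TGame>
{
    private readonly Func<string, TGame> _parse;

    public NotifyingGameReader(Func<string, TGame> parse)
    {
        _parse = parse ?? throw new ArgumentNullException(nameof(parse));
    }

    public event EventHandler<GameParsedEventArgs<TGame>> GameParsed;

    // Parses each raw game text in order and notifies subscribers as it goes.
    // Returns the total number of games parsed.
    public int ReadAll(IEnumerable<string> rawGames)
    {
        int count = 0;
        foreach (string raw in rawGames)
        {
            TGame game = _parse(raw);
            count++;
            GameParsed?.Invoke(this, new GameParsedEventArgs<TGame>(game, count));
        }
        return count;
    }
}
```

A subscriber would then do its own bookkeeping in the handler, e.g. `reader.GameParsed += (sender, e) => Console.WriteLine(e.GamesParsed);`, without having to hold all parsed games in memory at once.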
Very much appreciated anyways,
Thanks for the good work,
Michael