Forged Alliance Forever Forged Alliance Forever Forums 2014-10-04T09:55:07+02:00 /feed.php?f=45&t=7060 2014-10-04T09:55:07+02:00 2014-10-04T09:55:07+02:00 /viewtopic.php?t=7060&p=82642#p82642 <![CDATA[Re: Replay System]]>
1) abandon replay_vault. it may be a bit quicker, but it's gonna be a maintenance nightmare when new search
features have to work with 2 table formats. keeping every last replay in the replay_vault format would be a waste of
HD space.
2) game_stats.id and game_player_stats.id should be 32bit unsigned instead of 64bit signed integers.
since both of them are below 10M after 2 years, many decades will pass before they overflow.
primary keys are added to each secondary index row, which makes it especially important to make them
as small as possible.
3) use partitions (http://dev.mysql.com/doc/refman/5.5/en/ ... oning.html) with a condition on
a date expression (e.g. 3 months per partition). it is pretty clear that a search through the whole
replay vault is not gonna happen anyway, but rather multiple independent searches in progressing timespans.
this scheme implies that most searches of the most recent replays must search 2 partitions, but ancient
replays are kept away from current searches (they have their own pages and own indices). i guess the tradeoff
is debatable.
4) game_player_stats.{mean,deviation,after_mean,after_deviation} need not be floats but can be 16bit signed integers.
5) game_stats.endTime can be uint8 or uint16 (minutes relative to starttime). game time with minute resolution
is sufficient.
6) the server keeps track of running games in an engine=memory table and moves finished games into game_stats to
reduce write load on game_stats. the memory table has the same format as game_stats and game_player_stats.
or it only uses server (python/c++) data structures until the game is finished, whatever makes more sense.
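
A rough sketch of what points 4 and 5 could look like, written in Python purely for illustration. The scale factor of 10 (ratings stored in tenths of a point) and all function names are assumptions and do not come from the actual FAF schema.

```python
import struct

# Hypothetical scale: store ratings in tenths of a point (assumption, not from the FAF schema).
RATING_SCALE = 10
INT16_MIN, INT16_MAX = -32768, 32767
UINT16_MAX = 65535


def encode_rating(value: float) -> int:
    """Scale a float rating to an integer and check that it fits in a signed 16-bit column."""
    scaled = round(value * RATING_SCALE)
    if not INT16_MIN <= scaled <= INT16_MAX:
        raise ValueError(f"{value} does not fit in int16 at scale {RATING_SCALE}")
    return scaled


def decode_rating(stored: int) -> float:
    """Recover the approximate float rating from the stored integer."""
    return stored / RATING_SCALE


def encode_duration_minutes(start_ts: int, end_ts: int) -> int:
    """Game length in whole minutes (rounded down), checked against an unsigned 16-bit range."""
    minutes = (end_ts - start_ts) // 60
    if not 0 <= minutes <= UINT16_MAX:
        raise ValueError("duration does not fit in uint16 minutes")
    return minutes


def pack_row(mean: float, deviation: float, start_ts: int, end_ts: int) -> bytes:
    """Pack mean, deviation and duration into 6 bytes: two int16 and one uint16."""
    return struct.pack(
        "<hhH",
        encode_rating(mean),
        encode_rating(deviation),
        encode_duration_minutes(start_ts, end_ts),
    )
```

With this assumed scale the largest storable rating is 3276.7, so the scale would have to be chosen against the real rating range. For the duration, a uint8 tops out at 255 minutes, while a uint16 allows 65535 minutes.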

Statistics: Posted by rootbeer23 — 04 Oct 2014, 09:55


]]>
2014-10-03T10:09:06+02:00 2014-10-03T10:09:06+02:00 /viewtopic.php?t=7060&p=82576#p82576 <![CDATA[Re: Replay System]]> I won't do it myself anymore, but feel free to give it a try.

Statistics: Posted by Ze_PilOt — 03 Oct 2014, 10:09


]]>
2014-10-03T02:04:58+02:00 2014-10-03T02:04:58+02:00 /viewtopic.php?t=7060&p=82545#p82545 <![CDATA[Re: Replay System]]> Statistics: Posted by RoLa — 03 Oct 2014, 02:04


]]>
2014-10-01T17:26:54+02:00 2014-10-01T17:26:54+02:00 /viewtopic.php?t=7060&p=82412#p82412 <![CDATA[Re: Replay System]]>
It's crude (some info is disabled, ...) but it works.
As it stands, it's only enabled to check the last command before a replay desync, but the whole parser is there.

What is (mainly) missing is chart drawing (matplotlib is already inside the lobby).

Statistics: Posted by Ze_PilOt — 01 Oct 2014, 17:26


]]>
2014-10-01T17:06:03+02:00 2014-10-01T17:06:03+02:00 /viewtopic.php?t=7060&p=82410#p82410 <![CDATA[Re: Replay System]]> Statistics: Posted by RoLa — 01 Oct 2014, 17:06


]]>
2014-10-01T16:19:31+02:00 2014-10-01T16:19:31+02:00 /viewtopic.php?t=7060&p=82406#p82406 <![CDATA[Re: Replay System]]>
Sheeo wrote:
It also allows us to look into replays which are older than 3 months, and perhaps there are more low-hanging optimizations we can do (I've only spent a little time so far looking at the query).


Having a replay_vault table that does not delete older replays would allow that too.

I agree that optimizing the queries would be better, but it's always a question of time and priorities..

Statistics: Posted by Ze_PilOt — 01 Oct 2014, 16:19


]]>
2014-10-01T16:15:33+02:00 2014-10-01T16:15:33+02:00 /viewtopic.php?t=7060&p=82404#p82404 <![CDATA[Re: Replay System]]>
Ze_PilOt wrote:
Yes and no.

For example, if you say "let's take only the last 3 months" and search for rating 2000+ replays on theta, it's unlikely you will fill 100 replays (which is what is currently returned to the lobby).

So to let users see more, you would need a "see more..." option in the lobby, and pagination support.

I hate to repeat myself, but it's really the core of the problem: the lobby needs work on display/query.


Certainly the lobby should support better querying. I'm sure thygrrr is willing to help out here, or someone else if he's too busy :)

I do think it's relevant to see if we can actually answer these queries in reasonable time without the denormalized setup, however. It also allows us to look into replays which are older than 3 months, and perhaps there are more low-hanging optimizations we can do (I've only spent a little time so far looking at the query).

Statistics: Posted by Sheeo — 01 Oct 2014, 16:15


]]>
2014-10-01T16:11:08+02:00 2014-10-01T16:11:08+02:00 /viewtopic.php?t=7060&p=82403#p82403 <![CDATA[Re: Replay System]]> Statistics: Posted by rootbeer23 — 01 Oct 2014, 16:11


]]>
2014-10-01T16:07:52+02:00 2014-10-01T16:07:52+02:00 /viewtopic.php?t=7060&p=82402#p82402 <![CDATA[Re: Replay System]]>
Sheeo wrote:
So we have 560k replays within the last 3 months?


No, if you check the dump, there is a row for every player (so several rows for the same game).

That allows filtering the games by rating or player name (and if the lobby was able to, by faction).
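
To make the row count versus game count distinction concrete, here is a small sketch using an in-memory SQLite database as a stand-in for MySQL. The table name `replay_rows`, its columns, and the sample data are all made up for illustration; the real replay_vault layout is not shown in this thread.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny denormalized table: one row per player per game (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE replay_rows (game_id INTEGER, player TEXT, rating INTEGER)")
    conn.executemany(
        "INSERT INTO replay_rows VALUES (?, ?, ?)",
        [
            (1, "alice", 2100), (1, "bob", 1500),
            (2, "carol", 900), (2, "dave", 1200), (2, "erin", 2050),
            (3, "frank", 1000), (3, "gina", 1100),
        ],
    )
    return conn


def count_rows_and_games(conn: sqlite3.Connection) -> tuple:
    """Return (number of rows, number of distinct games)."""
    return conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT game_id) FROM replay_rows"
    ).fetchone()


def games_with_player_rated_at_least(conn: sqlite3.Connection, min_rating: int) -> list:
    """Game ids that have at least one player at or above min_rating."""
    cur = conn.execute(
        "SELECT DISTINCT game_id FROM replay_rows WHERE rating >= ? ORDER BY game_id",
        (min_rating,),
    )
    return [row[0] for row in cur]
```

In this toy data set there are 7 rows but only 3 games, which is the same effect that makes the row total much larger than the number of games played.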

Sheeo wrote:
I'm not entirely convinced that a denormalized copy of all these stats is worth the overhead; my version of the query wouldn't be run directly, since we're only ever interested in a small subset of replays (Which is what the indexes help with).


Yes and no.

For example, if you say "let's take only the last 3 months" and search for rating 2000+ replays on theta, it's unlikely you will fill 100 replays (which is what is currently returned to the lobby).

So to let users see more, you would need a "see more..." option in the lobby, and pagination support.

I hate to repeat myself, but it's really the core of the problem: the lobby needs work on display/query.
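
For what it's worth, the server side of a "see more" button can be fairly small. The following is only a sketch of keyset pagination, with SQLite standing in for MySQL; the `games` table, its columns, and the function names are hypothetical, and the page size of 100 simply mirrors the number mentioned above.

```python
import sqlite3

PAGE_SIZE = 100  # mirrors the 100 replays currently returned to the lobby


def make_demo_db(n_games: int = 250) -> sqlite3.Connection:
    """Create a hypothetical games table with ids 1..n_games."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE games (id INTEGER PRIMARY KEY, map TEXT)")
    conn.executemany(
        "INSERT INTO games (id, map) VALUES (?, ?)",
        [(i, "theta") for i in range(1, n_games + 1)],
    )
    return conn


def fetch_page(conn: sqlite3.Connection, before_id=None, page_size: int = PAGE_SIZE) -> list:
    """Return the newest games first; pass the last id of the previous page to get the next one."""
    if before_id is None:
        cur = conn.execute(
            "SELECT id, map FROM games ORDER BY id DESC LIMIT ?", (page_size,)
        )
    else:
        cur = conn.execute(
            "SELECT id, map FROM games WHERE id < ? ORDER BY id DESC LIMIT ?",
            (before_id, page_size),
        )
    return cur.fetchall()
```

The lobby would remember the smallest id it has displayed and send it back as `before_id` when the user asks for more. Filtering on `id < ?` instead of using a growing OFFSET lets each page be served from the primary key index.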

Statistics: Posted by Ze_PilOt — 01 Oct 2014, 16:07


]]>
2014-10-01T16:06:06+02:00 2014-10-01T16:06:06+02:00 /viewtopic.php?t=7060&p=82401#p82401 <![CDATA[Re: Replay System]]>
Ze_PilOt wrote:
Yes, data are removed if the replay is older than 3 months, because the lobby is unable to display that many replays either way.

If the lobby were able to run complex searches and display several pages, and with a new server with more memory, that limitation would go away (or be less limiting).


So we have 560k replays within the last 3 months? That doesn't add up for me, considering 54k games were played in the last 30 days.

The actual data send times should be included in profiles -- if anything it reminds one what is going on.

I'm not entirely convinced that a denormalized copy of all these stats is worth the overhead; my version of the query wouldn't be run directly, since we're only ever interested in a small subset of replays (Which is what the indexes help with).

Statistics: Posted by Sheeo — 01 Oct 2014, 16:06


]]>
2014-10-01T16:04:07+02:00 2014-10-01T16:04:07+02:00 /viewtopic.php?t=7060&p=82400#p82400 <![CDATA[Re: Replay System]]>
If you had said at the start of this 'I don't have time', I would not have responded at all, but instead we went back and forth with a bunch of misdirection.


So if you want to give some time to the project, I would like you to work on the lobby interface for the replay vault, so it can handle complex queries and pagination.


I can appreciate that but my time is limited (I have a young baby) and I'm not going to learn new languages for this, so no lobby ui work for me. However here are new functions that I want:

- replay search gets wildcard mapname search and wildcard playername search - two things that annoy me very much.
- replay search gets more results, fast - i want to see further back
- new statistics page where you can compare two players and see stats on how much they beat each other, divided by faction, map, etc.
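
As an illustration of the wildcard search idea, here is a minimal sketch that turns a user pattern with `*` into a SQL LIKE pattern. It uses SQLite as a stand-in for MySQL, and the `maps` table, its data, and the function names are invented for the example. Note that case sensitivity of LIKE depends on the database and collation.

```python
import sqlite3


def to_like_pattern(user_pattern: str) -> str:
    """Translate '*' wildcards to '%' while escaping characters LIKE treats specially."""
    escaped = (
        user_pattern.replace("\\", "\\\\")
        .replace("%", "\\%")
        .replace("_", "\\_")
    )
    return escaped.replace("*", "%")


def make_demo_db() -> sqlite3.Connection:
    """Hypothetical maps table with a few sample names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE maps (name TEXT)")
    conn.executemany(
        "INSERT INTO maps VALUES (?)",
        [("Theta Passage",), ("Seton's Clutch",), ("Open Palms",)],
    )
    return conn


def search_maps(conn: sqlite3.Connection, user_pattern: str) -> list:
    """Return map names matching the user's wildcard pattern, sorted by name."""
    cur = conn.execute(
        "SELECT name FROM maps WHERE name LIKE ? ESCAPE '\\' ORDER BY name",
        (to_like_pattern(user_pattern),),
    )
    return [row[0] for row in cur]
```

A pattern with a leading wildcard such as `*pass*` generally cannot use an ordinary index, so on a large table that kind of search may need a different approach.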

Although I wouldn't be doing any UI work for them, perhaps I could do the db work locally and prove on my own system that I can get these features working. Then I would mock up a screenshot of what it could look like, come on the forums, and ask someone to do the UI work for me.

Some of these features may not be possible, but I would like to discover that via attempting to do it. The truth is we can't even see what is possible without the schema.


Also, you can spend time working on the DB, but it's very unlikely that your work will make it to the production server if it's unsupervised. It's not trivial work.

Fair point, but you could come up with a mechanism for it. Eg: I prove my idea works via examples. I make change scripts and so on, server guy (you) reviews it, makes a backup, runs scripts, do some test, etc. Ideally though you would see the forum post I mentioned earlier (with mockup, looking for a UI guy) and say "yeah that's awesome I'm going to shove that right in". You know, community collaboration and all that.


It's even more true for something done in the free time of everyone involved.

You not having time to do things is a valid reason and I will not complain. But this thread has NOT been about your time.


If you refuse to work on what is necessary (determined by the project leader(s)), don't expect to have any kind of support from them.

The community is willing to spend time on enhancements that they are passionate about. If you had more time it would be wonderful if we could perfect and implement the community's ideas. Telling me to do some other work instead is ... not really what it's about. You can't take someone's passion and try and put it over there instead.

TLDR:

All I asked for was data and a schema. That's not a huge ask of anyone. You could have said you were busy but instead ... drama.

Statistics: Posted by nine2 — 01 Oct 2014, 16:04


]]>
2014-10-01T15:59:19+02:00 2014-10-01T15:59:19+02:00 /viewtopic.php?t=7060&p=82399#p82399 <![CDATA[Re: Replay System]]>
Sheeo wrote:
EDIT: Discounting data send time, I get roughly the same.

Since we won't be sending all the data though, I think it's more relevant to time actual queries being executed. I thought data was being deleted from the replay_vault table too?

I've only timed the queries each time, not displaying the data.

Yes, data are removed if the replay is older than 3 months, because the lobby is unable to display that many replays either way.

If the lobby were able to run complex searches and display several pages, and with a new server with more memory, that limitation would go away (or be less limiting).

Statistics: Posted by Ze_PilOt — 01 Oct 2014, 15:59


]]>
2014-10-01T15:53:34+02:00 2014-10-01T15:53:34+02:00 /viewtopic.php?t=7060&p=82398#p82398 <![CDATA[Re: Replay System]]>
Ze_PilOt wrote:
Sheeo wrote: On my local database, selecting 540k rows is a slow process, much slower than 200 microseconds. In fact, that is so on my fast server as well.


I just did a SELECT query?

Tell me how I should do it to run it as you do?

But really, searching anything in that table is blazing fast. The only problem is how much memory is needed.


Yeah sorry, I don't know how you're getting the times without data sending directly?

Sheeo wrote:
EDIT: Discounting data send time, I get roughly the same.

Since we won't be sending all the data though, I think it's more relevant to time actual queries being executed. I thought data was being deleted from the replay_vault table too?

Statistics: Posted by Sheeo — 01 Oct 2014, 15:53


]]>
2014-10-01T15:50:25+02:00 2014-10-01T15:50:25+02:00 /viewtopic.php?t=7060&p=82397#p82397 <![CDATA[Re: Replay System]]>
Sheeo wrote:
On my local database, selecting 540k rows is a slow process, much slower than 200 microseconds. In fact, that is so on my fast server as well.


I just did a SELECT query? Most likely everything is cached, as that table is queried all the time. Disabling caching never really worked in my tests.

Tell me how I should do it to run it as you do?

But really, searching anything in that table is blazing fast. The only problem is how much memory is needed.

Statistics: Posted by Ze_PilOt — 01 Oct 2014, 15:50


]]>
2014-10-01T15:52:17+02:00 2014-10-01T15:41:15+02:00 /viewtopic.php?t=7060&p=82396#p82396 <![CDATA[Re: Replay System]]>
Ze_PilOt wrote:
Sheeo wrote:
Ze_PilOt wrote: Using replay_vault table:

538,528 total, Query took 0.0002 sec


Can I see the query you used here?


Ah, for that one, a simple select, as it's basically the same result as the complex query you've optimized.


Okay, I have trouble reproducing that. How did you test?

On my local database, selecting 540k rows is a slow process, much slower than 200 microseconds. In fact, that is so on my fast server as well.

EDIT: Discounting data send time, I get roughly the same.

Since we won't be sending all the data though, I think it's more relevant to time actual queries being executed. I thought data was being deleted from the replay_vault table too?
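
One way to separate the two numbers is to time statement execution and row transfer independently. The sketch below does this against an in-memory SQLite database only because it is self-contained; the table, its size, and the function names are assumptions, and how the work splits between the two phases depends on the database driver, so the absolute figures will not match MySQL.

```python
import sqlite3
import time


def make_demo_db(n_rows: int = 50000) -> sqlite3.Connection:
    """Hypothetical table with n_rows rows, used only to have something to select."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE replay_rows (id INTEGER PRIMARY KEY, rating INTEGER)")
    conn.executemany(
        "INSERT INTO replay_rows (id, rating) VALUES (?, ?)",
        [(i, 1000 + i % 1500) for i in range(n_rows)],
    )
    return conn


def timed_select(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> dict:
    """Time statement execution and row fetching separately."""
    t0 = time.perf_counter()
    cur = conn.execute(sql, params)  # prepare and start the statement
    t1 = time.perf_counter()
    rows = cur.fetchall()            # pull every result row to the client
    t2 = time.perf_counter()
    return {"execute_s": t1 - t0, "fetch_s": t2 - t1, "rows": len(rows)}
```

Comparing `timed_select(conn, "SELECT * FROM replay_rows")` with a version that adds `LIMIT 100` shows how much of the total is spent moving rows rather than finding them.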

Statistics: Posted by Sheeo — 01 Oct 2014, 15:41


]]>