How to determine processed database table rows of a query?
Question in short: how can I get a list of row IDs of rows processed when
a query is processed?
Edit note: I am not looking for returned rows. When a user has 5 posts on
Facebook and I do a '''SELECT * FROM posts WHERE user=Mark ORDER BY date
desc LIMIT 1''' I know, that the returned number will be 1, but I'd like
to know how many rows have been processed (in this case without indices,
probably all rows). And I am primarily looking at SELECT statements.
Hello,
I am currently working on a project that aims in the direction of data
aging. I.e., we are trying to determine which tuples are accessed
regularly and which are not. We've got a decent workload (i.e., a query
log of the system) with the corresponding data and would like to know,
what rows have been processed.
Besides the question, what rows we're also interested in what attributes,
but that can be done parsing the query (projection, join attributes, and
where conditions). Leaving the question open how to get the actually
processed rows.
We are aware that many queries will (let's assume there are no indices)
process all rows, because there is a where-condition that requires a full
table scan. We are aware of that problem, but still wan't to find out,
which rows have been accessed.
My final question is now: How can we achieve that?
I have been looking into MySQL and Postgres but could not find sufficient
information (e.g., MySQL's 'explain' just returns an estimate for the
number of rows processed, but not any row IDs). I am guess that we will
have to modify the source code of a DB to achieve that kind of logging
(performance of that logging is not an issue, it's offline analytics). Has
anybody any recommendations how to achieve that/done that?
No comments:
Post a Comment