I am having a table called discussions. it contains all the topics as well as comments.
I want to fetch all the topics and their latest comments at once. if there is no comment for a topic, then topic is fetched or else the latest comment.
for example, if there are 19 topics, and 15 topics have comments, then the latest comments of those 15 topics and the remaining 4 un-commented topics are fetched.
I am using multiple sub-queries to fetch them at once. like the one below.
select * from discussions where id in (select id from discussions where parent_discussion is not null group by parent_discussion desc) or id in (select id from discussions where parent_discuss
ion is null and id not in (select distinct parent_discussion from discussions where parent_discussion is not null)) order by created_on desc;
here the first sub-query fetches all the latest comments and the second one fetches all the topics that have no comments.
here's my question to you guys. "How much efficient is this query ? do you have any other way around." Your suggestion would be of a great help. thanks in advance.
This is not an efficient query. I also think it doesn't do what you want. From your description of the requirements for the query, you want to select all the discussions plus some other data from another table.
Let's break down your query:
This returns the ids of all discussions with comments.
This returns the ids of all discussions without comments
This turns the ids of all discussions that that are not in the set of discussions with comments. In other words that don't have comments. However, the previous subquery already does that so this is redundant.
This returns all columns from all rows with matching ids. Since your query amounts to "all with comments or all without comments", this returns the whole table. So you could have just written select * from discussions.
It's difficult to talk about performance until you get the functionality right so maybe you can clarify the purpose of the subqueries?
by the way, the query I posted above fulfills my requirement. I assume you had trouble understanding my requirement.
let me break down the query.
This line returns all the records from the discussions table. where.. the condition goes...
whose id includes those records, which have parent discussion(i.e. these are comments). group by desc ensures that only the latest comment per discussion is fetched. and this sub-query fetches only the comments.
the 'OR' specifies that, record can be either a topic or a discussion.
This sub-query fetches all the records, that are topics. (i.e they do not have parent). and also ensures that they do not include any of those topics that are having comments and were fetched in the previous sub-query.
This line sorts all the records by their posting/commenting date with the latest on top.
If you are still not sure about this, take a look at the list of records after I ran this query.
We have 19 topics here, and only 15 topics have comments.
Let me remind you that i am using a single table `discussions` to store both topics as well as their comments.
This `discussions` table contains about 190 records. out of which, 19 records are topics(they do not have parent_discussion). remaining 171 records are comments. we are leaving out all the old comments and fetching only the latest comment per discussion.(the group by clause). and topics themselves if they do not have comments.
Jeanne's right, you are describing one table to hold topics and comments and this sounds a bit like single table anti-pattern. Add more tables, normalize the data, and the structure will make querying it a lot easier.