You can see my approach in the attached workbook. First, I created a calculated field called 'Date Time' to convert the string field 'Start Time' into a date-time format. Then I created the 'Minute' and 'Date Time Rounded' calculated fields to round to the nearest hour whenever the start time falls within 10 minutes of the hour. For example, 9:59 is rounded to 10:00, and 10:07 is also rounded to 10:00. I then created the 'Time Rounded' calculated field to convert the time back to a string that can be used to group similar times.
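The rounding logic described above can be sketched outside Tableau as well. Here is a Python illustration of the same idea (the function name and the exact handling of the 10-minute window are my assumptions, not the workbook's actual calculations):

```python
from datetime import datetime, timedelta

def round_to_hour_within_10(dt):
    """Round dt to the nearest hour, but only when it falls within
    10 minutes of an hour boundary; otherwise leave it unchanged."""
    if dt.minute >= 50:  # e.g. 9:59 -> 10:00
        return dt.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    if dt.minute <= 10:  # e.g. 10:07 -> 10:00
        return dt.replace(minute=0, second=0, microsecond=0)
    return dt            # outside the 10-minute window: keep as-is

print(round_to_hour_within_10(datetime(2016, 5, 1, 9, 59)))   # 2016-05-01 10:00:00
print(round_to_hour_within_10(datetime(2016, 5, 1, 10, 7)))   # 2016-05-01 10:00:00
print(round_to_hour_within_10(datetime(2016, 5, 1, 10, 30)))  # 2016-05-01 10:30:00
```

Times more than 10 minutes away from an hour boundary (like 10:30) are deliberately left untouched, matching the "within 10 minutes of the hour" condition.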
Please let me know if you have any questions about the calculations in the workbook.
Many thanks for suggesting the solutions!
I have tried to apply your solution to my dataset, but I ran into some difficulties making it work. I am afraid this is because I somehow oversimplified the actual conditions of my case, especially after I added one more layer to my viewership numbers by breaking them down into live viewing (viewing at the same time the TV program is being transmitted) and 7-day total playback (recorded and watched within 7 days after the live airing).
I'd appreciate it if you could take a look and try again with the actual dataset I attached here.
Hi Mei Tzy,
The real problem here is that the dimension you use for ranking
is a combination of the Channel, the Title, and the actual Start Time.
If the latter varies from the nominal (aka Planned) Start Time --
which is not in the datasource as a distinct field --
then we have no way to group Titles together.
Having such a field in your datasource --
whether an integer like [Slot ID]
or a string like [Start Time] --
is absolutely necessary for
the kind of analysis you're asking for.
Thank you for your prompt reply.
In this case, is it possible for us to first convert the "actual fluctuating start time" into a "consistent planned start time", then use the latter for program ranking?
In my previous version of the top programs workbook, in which I did not break down viewership into live and playback, Channel and Start Time were separate from Title; they were all independent dimensions. I combined them now so that the ranking with the viewership breakdown works correctly (maybe there is a better way to achieve the same result? please feel free to suggest one).
Attached is the previous version of the top programs workbook with Channel, Start Time and Title as independent dimensions. I'd appreciate it if you could take a look.
Meanwhile, I will check whether I do have a Planned Start Time in the raw database, and I will also try to apply your and Sarah's solutions using the previous version of the top programs workbook.
Thank you & have a great weekend ahead!
Hi Mei Tzy,
I really like how you've combined
Channels & Titles in your current datafile.
To cope with the Start Time problem,
I suggest calculating [Slot Time] by rounding
[Start Time] to the nearest 10 / 15 / 30 minutes.
With that, there is no need for my previous calculations --
just use [Slot Time] instead of [Start Time] in groupings.
Please find the workbook attached.
Hope it helps.
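The [Slot Time] rounding suggested above can be sketched in Python (this is a sketch of the idea, not the Tableau calculation itself; the function name and slot sizes are illustrative):

```python
from datetime import datetime, timedelta

def slot_time(dt, slot_minutes=15):
    """Round a datetime to the nearest slot boundary (e.g. 10/15/30 minutes)."""
    slot = timedelta(minutes=slot_minutes)
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = dt - midnight
    # round the elapsed time to the nearest whole number of slots
    slots = round(elapsed / slot)
    return midnight + slots * slot

print(slot_time(datetime(2016, 5, 1, 20, 58), 15))  # 2016-05-01 21:00:00
print(slot_time(datetime(2016, 5, 1, 21, 6), 15))   # 2016-05-01 21:00:00
```

With 15-minute slots, both a 20:58 and a 21:06 actual start collapse onto the same 21:00 slot, so titles with slightly fluctuating start times group together.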
PS I doubt that averaging [Rtg(000)] is the right way to go.
There is no data in the datasource regarding sample sizes --
i.e. the actual number of ratings behind a particular Channel / Title / Slot
that produced the resulting [Rtg(000)] value for a particular row.
Thus there is no way to calculate weighted metrics --
neither a weighted average nor a combined sample size.
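To illustrate why per-row sample sizes would be needed, here is a small Python sketch with entirely hypothetical numbers (none of these figures come from the dataset):

```python
# Hypothetical per-row ratings and the (unknown in the real data)
# number of panel ratings behind each row.
ratings = [90.0, 50.0, 60.0]   # Rtg(000) values per row
samples = [300, 50, 150]       # hypothetical sample sizes per row

# Simple average treats every row equally.
simple_avg = sum(ratings) / len(ratings)

# Weighted average gives rows with more underlying ratings more influence.
weighted_avg = sum(r * n for r, n in zip(ratings, samples)) / sum(samples)

print(round(simple_avg, 2))    # 66.67
print(round(weighted_avg, 2))  # 77.0
```

The two averages differ noticeably; without the per-row sample sizes in the datasource, only the (potentially misleading) simple average is available.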
Dear Yuriy Fal,
Sorry for replying this late.
I am happy to share with you what I have done using your solution (as attached).
By the way, I would also like to thank you for your interest in viewership numbers.
Here is some brief info about it for your reference:
- The TV program viewership tracking system Astro uses is Instar, which provides Dynamic Television Audience Measurement (DTAM) data developed by Kantar Media (website: http://www.kantarmedia.com/global).
- At the moment, DTAM data is based on sampling. Generally, in 2016, we are using 2.5k Chinese households to project to a universe of 2.5 million individual Chinese viewers. This household-to-individual projection is based on Kantar Media’s Probability of Individuals Viewing (PIV) algorithm.
Rtg(000) is the average number of viewers per minute over the airing duration (normally we use the average Rtg(000) to evaluate program performance). For example, 5 viewer-minutes over a 3-minute airing gives an average of 1.67 viewers per minute.
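The per-minute averaging can be written out explicitly; a minimal Python sketch with made-up minute-by-minute counts:

```python
# Made-up minute-by-minute viewer counts (in thousands) for a 3-minute airing.
viewers_per_minute = [1, 2, 2]   # 5 viewer-minutes in total

# Rtg(000): total viewer-minutes divided by the airing duration in minutes.
rtg = sum(viewers_per_minute) / len(viewers_per_minute)
print(round(rtg, 2))  # 1.67
```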
The sample and universe of the data source for the attached workbook are as below:
DTAM Universe (000)
Once again, thank you for your time and effort in helping me overcome this start-time grouping challenge!
Take care and all the best to you!
Hi Mei Tzy,
Thank you very much for the great follow-up.
It gives me plenty of food for thought to digest.
One such "thought from my gut"
is about Rtg(000) values (levels)
when comparing Titles to each other.
I feel that Rtg(000) values of, say, 90 and 50
are different beasts (statistically),
whereas 60 and 50 are not.
I couldn't prove it here, though --
I'm not sure about the right statistics
for panel measurements like this one.
I guess the "different beasts" you mentioned might be because the programs actually belong to different genre categories, whereas programs with closer ratings might come from the same genre category. Possible?
Anyway, feel free to let me know if you have further questions.
Hi Mei Tzy,
I'd like to say that 50 (average viewers per minute) --
which is ~2% of a whole panel of 2,500 households --
would be (statistically significantly) different from 100,
whereas 50 and 60 would not be.
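One way to formalize this intuition is a pooled two-proportion z-test, treating each average-viewer figure as a proportion of the 2,500-household panel. This is my own simplification, not something from the thread or the dataset; real panel data has design effects and the PIV projection that this ignores:

```python
from math import sqrt

def two_prop_z(x1, x2, n=2500):
    """Pooled two-proportion z statistic for x1 vs x2 viewers
    out of a panel of n households (a deliberate simplification)."""
    p1, p2 = x1 / n, x2 / n
    p = (x1 + x2) / (2 * n)           # pooled proportion
    se = sqrt(p * (1 - p) * (2 / n))  # pooled standard error
    return (p1 - p2) / se

# |z| > 1.96 corresponds to significance at the 5% level:
print(two_prop_z(100, 50))  # well above 1.96: 100 vs 50 differ significantly
print(two_prop_z(60, 50))   # below 1.96: 60 vs 50 do not
```

Under this rough model, 100 vs 50 viewers out of 2,500 is a significant difference while 60 vs 50 is not, matching the intuition above.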
This solution was brilliant, Yuriy Fal! Thank you for that.
I have a similar question, but I was not able to apply the same method to my sample, I believe due to the various calculations necessary on my data to get to a time slot.
Could you help me? I posted the sample in a separate conversation. I'd appreciate it if you could give me a hint on how to sort out my puzzle. See the link below:
Thank you so much