I suspect there are some subtleties here I don't quite understand, but it seems like you're making your life unnecessarily difficult by trying to use rawsql functions.
If you're writing Custom SQL anyway, can't you simply extend the first SELECT statement to include your date column and do something like:
SELECT id, datepart("week", Published) as week , count() OVER(.....) as quantity_created_by_user FROM articles WHERE....GROUP BY 2
Perhaps you can explain the logic in your OVER window function - why are you partitioning this way?
Never mind, I wasn't paying attention. You are partitioning because you need the user to have visited your site the previous week before you even "care" about them this week. Duh. Just about to run out for a walk, and will type more later, but you might want to consider using Table Calculations or Level of Detail Calculations for this work - they can do the partitioning you want,
Hey, any chance you can post some simple sample data in an excel worksheet? Just enough to show what the "base" records look like. There are a bunch of different ways to solve this, but I'm still not 100% clear on what the data looks like and don't want to waste my/your time by helping you solve the "wrong" problem. Let's not worry about your query for now, just show what the data looks like in Impala, and the RESULT you want to see.
Hey no problem, thanks for the help.
Here is a piece of the csv:
I can also add in user_ids and such, but that isn't really needed. Basically trying to answer "how many events were created today by returning users. I'll take a look at the docs for Table Calculations and Level of Detail calculations. Let me know if you have any tips .
- Does each record represent a week? Does your system actually associate time with these values? How?
- I don't see any of the (user) id values repeating at all, which tells me none of these are repeat users? Maybe this sample needs to enriched a little more?
Each row in the records above represents an individual occurrence of an article created. The quantity_of_articles_published_previously column is a rolling count I'll post an example below.
id, user_id, quantity_of_articles_published_previously, published
1, 1, 1, 2016-01-01
2, 2, 1, 2016-01-02
3, 1, 2, 2016-01-03
4, 1, 3, 2016-01-04
5, 1, 4, 2016-01-05
6, 2, 2, 2016-01-06
Nothing about this is weekly. I've been trying to get tableau to handle the "weekly" part. Although I could potentially handle it in query.
Does that help? Thanks!