Hi, I'm pretty new to Tableau, so I will try to explain everything.
I have a JSON file with the following data schema:
- Level 1: each entry is a paragraph of a long text (a book).
- Level 2 (of each paragraph) can contain the users' comments on that paragraph (each comment has a sub-level containing the user’s info).
- Level 3 (of each comment) can contain the users' answers to that comment (each answer has a sub-level containing the user’s info).
I cannot change how the data is stored in the JSON; I just receive it and have to visualize it.
Both comments and answers are considered “twylls” (a funny name like tweet).
I need to create a report to show the clients how much impact the project/game had on that book.
So, here are my problems:
- In a sheet I can visualize the Number of Records for comments and the Number of Records for answers. However, if I put these values into two parameters in order to use them in the dashboard (the report I will provide), the values are zero, even though the two measures in the sheet display real values.
- I simply want to create a sheet with a pie chart containing the Number of Records for comments and the Number of Records for answers, plus the total number (as a caption or title), the percentages, etc. It seems impossible to do such a simple task (show a pie chart containing two numbers) because Tableau wants a dimension (they are simply numbers, so what dimension is needed?).
- Not being able to use parameters, I cannot even do calculations and show them in the dashboard. Simple example: I have 160 comments and 40 answers. I would like to show that 80% of what people wrote were comments and 20% were answers, and that the conversion rate of comments is 25% (40 / 160 * 100), i.e. 1 answer for every 4 comments.
- I would like to show how many unique users played. Users are stored as a sub-level of comments and as a sub-level of answers, so I can count unique ids in each of those two levels, but I cannot just sum the two counts because the same user may appear in both. I need to make a union of the records of those two levels (lists, or whatever they are), then count the unique values. After that, I would like to show the average number of twylls per user (e.g. with 50 unique users, I have 4 twylls per user).
- Comments and answers have a timestamp. I have already succeeded in converting it to a date, but I cannot show the day of the first comment (or answer) and of the last one written, since they are in two different levels. I need to find the first and the last, to create a parameter that tells how many days the game lasted, and then show the average number of twylls per day.
- Each twyll has a field called content, containing the text written by the user but stored encoded (I mean, accented letters are ASCII-encoded and not Unicode), so I first need to decode it (I want the “outer” text, as the user wrote it) and then calculate the length in characters per twyll, so that I can later show the average length of the twylls, the comments and the answers.
- I would like to create a table or a chart in which a client reading the report can type a list of users (nicknames) and see some analytics only for them. Example: for alicetw, bianconiglio, ilcappellaio, thequeen the output would be:
Nickname: alicetw | Twylls: 113 | Answers: 42 | Conversion: 37% (+15% compared to the average)
Nickname: bianconiglio | Twylls: 58 | Answers: 16 | Conversion: 28% (+5% compared to the average)
Nickname: thequeen | Twylls: 47 | Answers: 8 | Conversion: 17% (-6% compared to the average)
Nickname: ilcappellaio | Twylls: 5 | Answers: 2 | Conversion: 40% (+17% compared to the average)
- I would like to show the top five (or top 10, or top 3; I want to link the number to a slider or some other control that the client can change live) of the best twylls (comments with the most answers), showing content, author, number of answers, etc.
- I would like to scan all the written contents, find only the words with a hashtag and display a word cloud of the most used hashtags.
- I want to display a word cloud of the most used words, excluding hashtags, excluding words shorter than a given number of characters, and excluding a custom list of words (stopwords).
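To make the numbers I am after concrete, here is a minimal Python sketch of the calculations described above, done outside Tableau on the raw JSON. It is only an illustration under several assumptions: the field names (`comments`, `answers`, `user`, `id`, `timestamp`, `content`) are hypothetical, the timestamp is assumed to be Unix seconds, the encoded accents are assumed to be HTML entities, and the game duration is counted inclusively (first and last day both count).

```python
import html
import re
from collections import Counter
from datetime import datetime, timezone


def _row(kind, twyll):
    """Build one flat record; all field names here are hypothetical."""
    return {
        "kind": kind,
        "user": twyll["user"]["id"],
        # Assumption: timestamp is Unix seconds, interpreted in UTC.
        "day": datetime.fromtimestamp(twyll["timestamp"], tz=timezone.utc).date(),
        # Assumption: accented letters are stored as HTML entities.
        "text": html.unescape(twyll["content"]),
    }


def flatten_twylls(paragraphs):
    """Union of comments (level 2) and answers (level 3) as one flat list."""
    rows = []
    for paragraph in paragraphs:
        for comment in paragraph.get("comments", []):
            rows.append(_row("comment", comment))
            for answer in comment.get("answers", []):
                rows.append(_row("answer", answer))
    return rows


def summarize(rows):
    """Compute the headline figures; assumes at least one comment exists."""
    total = len(rows)
    comments = sum(1 for r in rows if r["kind"] == "comment")
    answers = total - comments
    users = {r["user"] for r in rows}  # unique across both levels
    days = (max(r["day"] for r in rows) - min(r["day"] for r in rows)).days + 1
    return {
        "comments": comments,
        "answers": answers,
        "comment_share_pct": 100 * comments / total,
        "answer_share_pct": 100 * answers / total,
        "conversion_pct": 100 * answers / comments,
        "unique_users": len(users),
        "twylls_per_user": total / len(users),
        "days": days,
        "twylls_per_day": total / days,
        "avg_length": sum(len(r["text"]) for r in rows) / total,
    }


def hashtag_counts(rows):
    """Count hashtags case-insensitively, as input for a word cloud."""
    return Counter(
        tag.lower() for r in rows for tag in re.findall(r"#\w+", r["text"])
    )
```

With 160 comments and 40 answers, `summarize` would give the 80% / 20% split and the 25% conversion rate from my example. A flat list like the one returned by `flatten_twylls`, with one row per twyll and a `kind` column, is also the shape in which the pie chart would have an obvious dimension.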