you are right, I'm using a wide table.
I think I cannot benefit because the information I have are basically in one field under this structure:
rules tripped 5237260000000467926:90;5237260000000612092:150;5237260000000683674:500;5237260000000683673:800;5237260000000582213:100;5237260000000468663:10; 5237260000000467919:40;5237260000000467387:50;5237260000000467386:50;5237260000000467925:380;5237260000000468661:40;
So I created a table with a column for all the potential rule IDs and populated with how many times the rules has been seen.
Do you think is there anyway I can analyse it directly in tableau without this manipulation?
There's not enough context based on those strings above for me to provide you with decent advice.
I don't know what piece is or what it signifies.
Generally I would say this - it looks like you need to preprocess your data before you put it into tableau. If you have a better formed file with a list of ID's, you could do a lookup to an aliases file in order to automatically map an ID to a friendly name.
Without seeing some data or a better broken down example, I won't be able to provide anything more specific than that though....
The reason why I cannot pivot it's because the field cannot be counted or summed. It's just a string where I can extract the IDs.
Thanks for your suggestion anyway,
And based on that I'd say you'd be better off pre-processing these strings prior to tableau so you end up with one string per row in some sort of an iterative script.
And this is exactly what I have done. That's why the table is more wide than deep.
Pre-processing doesn't mean it has to be wide versus deep. Based on my understanding I'd have one ID per row as part of a dimension like 'Observation ID', then I could cross reference (join) that observation ID to another table to get a friendly name.
The problem with having one column per ID is that it's not possible to do a lookup on the column name itself, en-mass.
If you can prepare a trimmed down example for a couple of records of your source data and how you're currently pulling those into Tableau now and attach a Tableau Packaged Workbook, we can work through a couple of options. I
Thank you so much for you kind reply.
To explain you my original problem I attach a sample of the data I have. In this sample, every row represents a unique transaction or listing. in the field "scoring7_rules_tripped" there are the rules that tripped for every listing with the related score:
example in the first transaction, the first 2 rules tripped are: 5237260000000734034:0;5237260000000616453:1 which means rule id 5237260000000734034 with score 0 and rule id 5237260000000616453 with score 1
as you can see in the sample, every transaction can have a completely different set of rules tripped.
Now, my final goal is to build a report where I can count how many time every rule id tripped (so how many transactions has a specific rule id), and I would like to rank them to know what are rules most tripped.
In order to do that, I preprocessed the table and vectorize the rules so I have every column for every possible rule id, and I populated the filed for every transaction with a 1 in case the rules was tripped. To reduce the dimension of the table I have also grouped the transactions by day, but this should change the scenario to describe my situation.
Finally for the way I've preprocessed I need to change the label for every rule id using the rule name that I can take from another table.
Do you have any suggestion on a better way to preprocess the data or a way to build the report without this preprocessing based on the structure of my data?
Thanks you so much for your reply.
tableau_sample.csv.zip 1.6 KB
Simply put, if you want to report on the values in the scoring7_rules_tripped column individually, you need to preprocess and break them out so you end up with many rows per Transaction_UUID.
Here's an example based on your first row of data;
Transaction_UUID Scoring7_Rule_Tripped Score e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000734034 0 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000616453 1 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000616453 1 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000517834 0 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000517833 1 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000457038 -10 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000461413 5 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000461414 5 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000461416 5 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000461415 5 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000461417 5 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000461418 5 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000458235 1 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000458749 30 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000458748 40 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000706335 0 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000706336 0 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000587143 0 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000459393 4290 e4d98f4d-7e53-4a29-b44c-7b378ea83074 5237260000000559434 0
If you were to pull that into Tableau now, you could report on the count of rule tripped, the sum of the score by rule tripped etc.
I will try and see the results. I'm just worried there is an impact on the performance if I multiple the rows that much.
Anyway, can I then easily replace the rules id with the rule names once I pivot the data from this format?
You can join the Rule ID to a lookup table then and pull in a friendly name. You could put them both into the same datasource and join them in the connection dialog.
I follow your suggestion and even if now I have a huge amount of data I've been able to create the report in the structure I wanted.
Now I have another question:
I created the report with the CNT(Rule) to have the measure of how many times a rule have been tripped.
But the same rule can trip multiple time for the same transaction 4d98f4d-7e53-4a29-b44c-7b378ea83074 tripped twice the rule 5237260000000616453, and I would like to have both measure:
1) how many times a rule have been tripped (done with the CNT(Rule)
2) how many distinct transaction tripped every single rule.
what's the best way to calculate the second field?
Please upload a Tableau Packaged Workbook including sample data which supports your scenario.