Manipulate Dataset to Extract Values from String and Use as Unique Identifiers

Meghan_550057
Meghan_550057 Posts: 107
edited May 2020 in Questions

Looking for some guidance about how to fully accomplish this data manipulation use case. In the original data set, you'll see that the emails are embedded within a string (so there could be many emails listed for a single value in column 1). Example listed in Original:

Looking for a way to extract individual emails and put them into separate rows with their corresponding values in the next row. See Requested:

I've built a sample with batch processing, extract the row values corresponding to column 1 and then converting to text > making comma delimited > then converting back to a table with their corresponding value in column 1 included. I've gotten to the point that I can get the emails into their own rows with their corresponding column 1 values but then if the email has multiple column 1 values, I'm struggling to combine those together with commas into a single cell. The final version of what I've gotten in my process is below. As you can see, email1 appears in two rows but needs to be collapsed into one with 123,456 in column 2. Running into issues specifically around the generic column names (Column 1) when I try to batch again to consolidate the codes for each email.

Example process exported and attached.

So looking for guidance on two things:
1. Is there an easier way to accomplish this process build without batching and the table > text > table steps?
2. If this is an efficient way to build this process, how would you recommend combining the column 2 values to be collapsed for each unique email address?

Tagged:

Answers