My data below background:
This is question&answer data
Each Section_ID have 3 respondents
I would like to divide each “Question” and “Answer” into 3 separate columns in order to see the consensus between 3
Under “Section_ID”, each respondents’ questions and answers are unique
Other:
Question’s data is equal to Answer’s data (Question=Answer)
Each Section_ID&Interview_date has the same value repeated
==================================================== What I would like to do:
Create new columns (divide each “Question” and “Answer” into 3 separate columns, original Section_ID and Interview_date can remain the same)
Assign Question&Answer’s values into different columns
I would like Section_ID and Interview_date aligned the longst value;
Add blank if the columns(Question&Answer) doesn’t have any more values
Thank you for reading!
import pandas as pd
dic = {'Question_ID': ["AD","AD","AD","HG","HG","JI","JI","JI","JI","OP","RT","ER","ER","ER","ER"],
'Answer_ID': ['BB', 'BB', 'BB', 'GG', 'GG', 'UI', 'UI', 'UI', 'UI', 'LL', 'TT', 'RR', 'RR', 'RR', 'RR'],
'Section_ID': ['X'] * 9 + ['Z'] * 6,
'Person_Incharge': ['Y'] * 9 + ['Q'] * 6,}
df = pd.DataFrame(dic)
df
# dfx = df.set_index(["Section_ID", "Person_Incharge"]).unstack(["Question_ID","Answer_ID"])
Question_ID | Answer_ID | Section_ID | Person_Incharge | |
---|---|---|---|---|
0 | AD | BB | X | Y |
1 | AD | BB | X | Y |
2 | AD | BB | X | Y |
3 | HG | GG | X | Y |
4 | HG | GG | X | Y |
5 | JI | UI | X | Y |
6 | JI | UI | X | Y |
7 | JI | UI | X | Y |
8 | JI | UI | X | Y |
9 | OP | LL | Z | Q |
10 | RT | TT | Z | Q |
11 | ER | RR | Z | Q |
12 | ER | RR | Z | Q |
13 | ER | RR | Z | Q |
14 | ER | RR | Z | Q |
df_split_questions = df.copy()
question_ids = ['AD', 'HG', 'JI', 'OP', 'RT', 'ER']
for id in question_ids:
df_split_questions['Question_ID_' + id] = df['Question_ID']==id
df_split_questions
Question_ID | Answer_ID | Section_ID | Person_Incharge | Question_ID_AD | Question_ID_HG | Question_ID_JI | Question_ID_OP | Question_ID_RT | Question_ID_ER | |
---|---|---|---|---|---|---|---|---|---|---|
0 | AD | BB | X | Y | True | False | False | False | False | False |
1 | AD | BB | X | Y | True | False | False | False | False | False |
2 | AD | BB | X | Y | True | False | False | False | False | False |
3 | HG | GG | X | Y | False | True | False | False | False | False |
4 | HG | GG | X | Y | False | True | False | False | False | False |
5 | JI | UI | X | Y | False | False | True | False | False | False |
6 | JI | UI | X | Y | False | False | True | False | False | False |
7 | JI | UI | X | Y | False | False | True | False | False | False |
8 | JI | UI | X | Y | False | False | True | False | False | False |
9 | OP | LL | Z | Q | False | False | False | True | False | False |
10 | RT | TT | Z | Q | False | False | False | False | True | False |
11 | ER | RR | Z | Q | False | False | False | False | False | True |
12 | ER | RR | Z | Q | False | False | False | False | False | True |
13 | ER | RR | Z | Q | False | False | False | False | False | True |
14 | ER | RR | Z | Q | False | False | False | False | False | True |
df_split_answers = df_split_questions.copy()
answer_ids = ['BB', 'GG', 'UI', 'LL', 'RR']
for id in answer_ids:
df_split_answers['Answer_ID_' + id] = df['Answer_ID']==id
df_split_answers.drop(['Question_ID', 'Answer_ID'], axis=1)
Section_ID | Person_Incharge | Question_ID_AD | Question_ID_HG | Question_ID_JI | Question_ID_OP | Question_ID_RT | Question_ID_ER | Answer_ID_BB | Answer_ID_GG | Answer_ID_UI | Answer_ID_LL | Answer_ID_RR | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | X | Y | True | False | False | False | False | False | True | False | False | False | False |
1 | X | Y | True | False | False | False | False | False | True | False | False | False | False |
2 | X | Y | True | False | False | False | False | False | True | False | False | False | False |
3 | X | Y | False | True | False | False | False | False | False | True | False | False | False |
4 | X | Y | False | True | False | False | False | False | False | True | False | False | False |
5 | X | Y | False | False | True | False | False | False | False | False | True | False | False |
6 | X | Y | False | False | True | False | False | False | False | False | True | False | False |
7 | X | Y | False | False | True | False | False | False | False | False | True | False | False |
8 | X | Y | False | False | True | False | False | False | False | False | True | False | False |
9 | Z | Q | False | False | False | True | False | False | False | False | False | True | False |
10 | Z | Q | False | False | False | False | True | False | False | False | False | False | False |
11 | Z | Q | False | False | False | False | False | True | False | False | False | False | True |
12 | Z | Q | False | False | False | False | False | True | False | False | False | False | True |
13 | Z | Q | False | False | False | False | False | True | False | False | False | False | True |
14 | Z | Q | False | False | False | False | False | True | False | False | False | False | True |