Split function not maintaining structure of dataframe?



























I am doing hierarchical clustering in R and need the elements of each cluster separately.

When I use the following code, the data is split into a list of 3 plain numeric vectors (e.g. num [1:2628]); none of the column information from the original data frame (dataA) is carried over.



clusterA <- hclust(dist(dataA), method = "single")
NumA <- 3
label <- cutree(clusterA, NumA)

clusterXlist <- split(dataA, f = label)
str(clusterXlist[[1]])


How can I make sure that the split keeps the structure of dataA?



Edit: in my case

> str(clusterXlist[[1]])
num [1:2628] 0.0529 -0.3909 -0.4465 0.1 0.8393 ...

whereas for dataA



> str(dataA)


num [1:440, 1:6] 0.0529 -0.3909 -0.4465 0.1 0.8393 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
- attr(*, "scaled:center")= Named num [1:6] 12000 5796 7951 3072 2881 ...
..- attr(*, "names")= chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
- attr(*, "scaled:scale")= Named num [1:6] 12647 7380 9503 4855 4768 ...
..- attr(*, "names")= chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...


Edit 2: for dataA

> dput(head(dataA, n = 20))
structure(c(0.0528730042415329, -0.390857056063646, -0.44652098379972,
0.0999975794271863, 0.839284119671916, -0.204572661537808, 0.00993903725191922,
-0.349583518736614, -0.477357534676238, -0.473957607271904, -0.682697336282181,
0.0905884780058897, 1.55872457204484, 0.728746944991474, 1.00042486502152,
-0.138155475034538, -0.868191050016313, -0.484236457564077, 0.521904849881291,
-0.333690834823332, 0.522972471408079, 0.543838613660349, 0.408073194590386,
-0.623310408164662, -0.0523368792616442, 0.333686752405346, -0.351915064454946,
-0.113851350576777, -0.291078065290861, 0.717677967619194, -0.053285340273111,
-0.63306600713975, 0.883794139056095, 0.0557876760455718, 0.497093035238056,
-0.634420951441845, 0.409157150032062, 0.0488774601048851, 0.0719115132405076,
-0.447303143322465, -0.0410681453901357, 0.170124700204028, -0.0281250860936324,
-0.3925300807586, -0.0792659545334748, -0.297298628211157, -0.10273182626616,
0.15518230654465, -0.185125447641461, 1.15011422238562, 0.528531691780372,
-0.360751187201331, 0.400469064432042, 0.739829765498898, 0.435615257968889,
-0.434621330503326, 0.438772101699743, -0.528063904936618, 0.226000834240152,
0.159180975270399, -0.588697039406295, -0.269829034507317, -0.137379339965946,
0.68636300602308, 0.173661155768845, -0.495590877769126, -0.533904475256987,
-0.288985833251248, -0.545233764836731, -0.394039245717966, 0.273564891153861,
-0.340276616984998, -0.573659982327726, 0.00475174748902491,
-0.572218072744849, -0.551001403168238, -0.605176006067741, -0.459955112363749,
-0.178576756619561, -0.494972916519322, -0.0435191938188023,
0.0863085949200282, 0.13308015693741, -0.498021323377842, -0.23165413161966,
-0.227878848586867, 0.0542186891412866, 0.0921812574154842, -0.244448146341904,
0.952945788892319, 0.649245242698738, -0.489212329634658, 0.209634507324604,
0.802353943473126, 0.456496070080021, -0.40217108193415, 0.341140199633565,
-0.526755422016323, -0.0240135648160378, -0.0762383134363428,
-0.066263629344282, 0.0890496850231094, 2.24074190324533, 0.0933048443208461,
1.29786952218849, -0.0261942126239276, -0.347458739603052, 0.369181005457445,
-0.274766434933383, 0.203229792845712, 0.0777025935624781, -0.364479376793999,
0.498608767430271, -0.327246732938803, 0.228051555415843, -0.394620088486301,
-0.157749554245622, 1.04716972023017, 0.587257919466454, -0.36306099036142
), .Dim = c(20L, 6L), .Dimnames = list(NULL, c("Fresh", "Milk",
"Grocery", "Frozen", "Detergents_Paper", "Delicassen")))


For clusterXlist[[1]], which was obtained by splitting dataA:

> dput(head(clusterXlist[[1]], n = 20))
c(0.0528730042415329, -0.390857056063646, -0.44652098379972,
0.0999975794271863, 0.839284119671916, -0.204572661537808, 0.00993903725191922,
-0.349583518736614, -0.477357534676238, -0.473957607271904, -0.682697336282181,
0.0905884780058897, 1.55872457204484, 0.728746944991474, 1.00042486502152,
-0.138155475034538, -0.868191050016313, -0.484236457564077, 0.521904849881291,
-0.333690834823332)


































  • Can you provide an example? It's hard to understand what you are asking.

    – anotherfred
    Jan 19 at 18:34











  • I've added it to the question.

    – saket sinha
    Jan 19 at 19:16











  • It's a start, saket, but I cannot take the output from your str(dataA) and do anything with it. Can you provide the output from dput(head(dataA,n=20)) (or some meaningful number of rows)? It provides an unambiguous and reproducible set of data. And if it matters, nowhere in this question do I see a data.frame, just matrices and lists, and the hclust and split functions return lists.

    – r2evans
    Jan 19 at 19:51













  • dataA is not a data frame but a numeric matrix by the looks of it. A dput(dataA) would be good.

    – Rich Scriven
    Jan 19 at 20:00













  • Added dput output for both.

    – saket sinha
    Jan 19 at 20:08
























edited Jan 21 at 15:17, marc_s

asked Jan 19 at 18:29, saket sinha

























1 Answer
































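What you have there is a matrix, not a data frame. split() has no matrix method, so it treats dataA as a plain vector of 440 * 6 = 2640 values and recycles label across it, which is why each piece comes back as a bare numeric vector.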



class(dataA)
# [1] "matrix"
# (R 4.0.0 and later print: [1] "matrix" "array")
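To see the flattening in isolation, here is a minimal sketch on a toy matrix. The names m, grp, flat and kept are made up for illustration and stand in for dataA, label and the results; they are not from the question.

```r
# Toy stand-ins for dataA and label (hypothetical, for illustration only)
m <- matrix(1:12, nrow = 4, dimnames = list(NULL, c("a", "b", "c")))
grp <- c(1, 1, 2, 1)

# split() uses its default method here: m is handled as a length-12 vector
# and grp is recycled across it, so the dimensions and column names are lost
flat <- split(m, grp)
str(flat)

# Subsetting rows with drop = FALSE keeps the matrix shape and column names
kept <- m[grp == 1, , drop = FALSE]
dim(kept)
colnames(kept)
```

The pieces in flat mix values from all columns, which matches the bare num vector seen in the question.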


The quick and easy way to split() would be to do



split(as.data.frame(dataA), label)
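As a rough sketch of what that call gives back, and of how the pieces could be turned into matrices again if needed, consider the toy example below. The objects m, grp, pieces_df and pieces_mat are hypothetical names, not part of the question's data.

```r
# Toy stand-ins for dataA and label (hypothetical, for illustration only)
m <- matrix(c(0.5, -1.2, 0.3, 2.1, 0.0, -0.7, 1.4, 0.9),
            nrow = 4, dimnames = list(NULL, c("Fresh", "Milk")))
grp <- c(1, 2, 1, 1)

# Each list element is a data frame that keeps the column names
pieces_df <- split(as.data.frame(m), grp)
str(pieces_df[["1"]])

# Coerce every piece back to a numeric matrix if later code expects one;
# the converted matrices may carry the original row positions as row names
pieces_mat <- lapply(pieces_df, as.matrix)
dim(pieces_mat[["1"]])
```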


However, the list elements are then data frames, which may cause issues in later calculations that expect a matrix, so you may need to coerce them back with as.matrix(). I would recommend using lapply() to subset the matrix by rows instead, as follows.



clusterXlist <- lapply(
  unique(label),
  function(i) dataA[label == i, , drop = FALSE]
)


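This keeps the matrix structure, including the column names, in every list element; drop = FALSE ensures that even a single-row cluster stays a matrix. (The output below is from a small sample rather than the full 440-row matrix, hence the 18 rows.)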



str(clusterXlist[[1]])
# num [1:18, 1:6] 0.0529 -0.3909 0.1 0.8393 -0.2046 ...
# - attr(*, "dimnames")=List of 2
# ..$ : NULL
# ..$ : chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
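One thing to be aware of with lapply() over unique(label) is that the resulting list is unnamed and ordered by the first appearance of each label. If named elements are preferred, a possible variant is to split the row indices and subset by them. This is only a sketch on toy data; m, grp, row_sets and by_cluster are hypothetical names.

```r
# Toy stand-ins for dataA and label (hypothetical, for illustration only)
m <- matrix(1:12, nrow = 4, dimnames = list(NULL, c("a", "b", "c")))
grp <- c(2, 1, 2, 2)

# Split the row indices rather than the matrix, then subset rows per cluster
row_sets <- split(seq_len(nrow(m)), grp)
by_cluster <- lapply(row_sets, function(rows) m[rows, , drop = FALSE])

names(by_cluster)        # cluster labels, sorted, as list names
dim(by_cluster[["1"]])   # a single-row cluster is still a matrix
dim(by_cluster[["2"]])
```

With the question's objects this would read split(seq_len(nrow(dataA)), label), and the elements could then be accessed by cluster label, for example by_cluster[["1"]].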













edited Jan 19 at 20:52

answered Jan 19 at 20:28, Rich Scriven





























