Split function not maintaining structure of dataframe?

I am doing hierarchical clustering in R and need the elements of each cluster separately.

When I run the following, the data is split into a list of 3 plain numeric vectors (for example num [1:2628]); no column information from the original data frame (dataA) is carried over.

clusterA <- hclust(dist(dataA), method = "single")
NumA <- 3
label <- cutree(clusterA, NumA)

clusterXlist <- split(dataA, f = label)
str(clusterXlist[[1]])

How can I make sure that the result maintains the structure of dataA?

Edit: in my case

> str(clusterXlist[[1]])
 num [1:2628] 0.0529 -0.3909 -0.4465 0.1 0.8393 ...

whereas for dataA

> str(dataA)
 num [1:440, 1:6] 0.0529 -0.3909 -0.4465 0.1 0.8393 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
 - attr(*, "scaled:center")= Named num [1:6] 12000 5796 7951 3072 2881 ...
  ..- attr(*, "names")= chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
 - attr(*, "scaled:scale")= Named num [1:6] 12647 7380 9503 4855 4768 ...
  ..- attr(*, "names")= chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...

Edit 2: for dataA

> dput(head(dataA,n=20))
structure(c(0.0528730042415329, -0.390857056063646, -0.44652098379972, 
0.0999975794271863, 0.839284119671916, -0.204572661537808, 0.00993903725191922, 
-0.349583518736614, -0.477357534676238, -0.473957607271904, -0.682697336282181, 
0.0905884780058897, 1.55872457204484, 0.728746944991474, 1.00042486502152, 
-0.138155475034538, -0.868191050016313, -0.484236457564077, 0.521904849881291, 
-0.333690834823332, 0.522972471408079, 0.543838613660349, 0.408073194590386, 
-0.623310408164662, -0.0523368792616442, 0.333686752405346, -0.351915064454946, 
-0.113851350576777, -0.291078065290861, 0.717677967619194, -0.053285340273111, 
-0.63306600713975, 0.883794139056095, 0.0557876760455718, 0.497093035238056, 
-0.634420951441845, 0.409157150032062, 0.0488774601048851, 0.0719115132405076, 
-0.447303143322465, -0.0410681453901357, 0.170124700204028, -0.0281250860936324, 
-0.3925300807586, -0.0792659545334748, -0.297298628211157, -0.10273182626616, 
0.15518230654465, -0.185125447641461, 1.15011422238562, 0.528531691780372, 
-0.360751187201331, 0.400469064432042, 0.739829765498898, 0.435615257968889, 
-0.434621330503326, 0.438772101699743, -0.528063904936618, 0.226000834240152, 
0.159180975270399, -0.588697039406295, -0.269829034507317, -0.137379339965946, 
0.68636300602308, 0.173661155768845, -0.495590877769126, -0.533904475256987, 
-0.288985833251248, -0.545233764836731, -0.394039245717966, 0.273564891153861, 
-0.340276616984998, -0.573659982327726, 0.00475174748902491, 
-0.572218072744849, -0.551001403168238, -0.605176006067741, -0.459955112363749, 
-0.178576756619561, -0.494972916519322, -0.0435191938188023, 
0.0863085949200282, 0.13308015693741, -0.498021323377842, -0.23165413161966, 
-0.227878848586867, 0.0542186891412866, 0.0921812574154842, -0.244448146341904, 
0.952945788892319, 0.649245242698738, -0.489212329634658, 0.209634507324604, 
0.802353943473126, 0.456496070080021, -0.40217108193415, 0.341140199633565, 
-0.526755422016323, -0.0240135648160378, -0.0762383134363428, 
-0.066263629344282, 0.0890496850231094, 2.24074190324533, 0.0933048443208461, 
1.29786952218849, -0.0261942126239276, -0.347458739603052, 0.369181005457445, 
-0.274766434933383, 0.203229792845712, 0.0777025935624781, -0.364479376793999, 
0.498608767430271, -0.327246732938803, 0.228051555415843, -0.394620088486301, 
-0.157749554245622, 1.04716972023017, 0.587257919466454, -0.36306099036142
), .Dim = c(20L, 6L), .Dimnames = list(NULL, c("Fresh", "Milk", 
"Grocery", "Frozen", "Detergents_Paper", "Delicassen")))

For clusterXlist[[1]], which was obtained by splitting dataA:

> dput(head(clusterXlist[[1]],n=20))
c(0.0528730042415329, -0.390857056063646, -0.44652098379972, 
0.0999975794271863, 0.839284119671916, -0.204572661537808, 0.00993903725191922, 
-0.349583518736614, -0.477357534676238, -0.473957607271904, -0.682697336282181, 
0.0905884780058897, 1.55872457204484, 0.728746944991474, 1.00042486502152, 
-0.138155475034538, -0.868191050016313, -0.484236457564077, 0.521904849881291, 
-0.333690834823332)

edited Jan 21 at 15:17

marc_s

asked Jan 19 at 18:29

saket sinha

Can you provide an example? It's hard to understand what you are asking.

– anotherfred
Jan 19 at 18:34

I've added it to the question.

– saket sinha
Jan 19 at 19:16

It's a start, saket, but I cannot take the output from your str(dataA) and do anything with it. Can you provide the output from dput(head(dataA,n=20)) (or some meaningful number of rows)? It provides an unambiguous and reproducible set of data. And if it matters, nowhere in this question do I see a data.frame, just matrices and lists, and the hclust and split functions return lists.

– r2evans
Jan 19 at 19:51

dataA is not a data frame but a numeric matrix by the looks of it. A dput(dataA) would be good.

– Rich Scriven
Jan 19 at 20:00

added dput for both

– saket sinha
Jan 19 at 20:08

1 Answer
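What you have there is a matrix, not a data frame. The default split() method treats a matrix as a plain vector, so the labels are recycled over all of its elements and the dimensions and column names are dropped.

class(dataA)
# [1] "matrix"   (R >= 4.0.0 prints "matrix" "array")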
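To see why the pieces come back as flat vectors, here is a minimal sketch on a toy matrix. The objects m, label, flat and by_hand are made up for illustration and are not the asker's data; the point is only that split() on a matrix behaves like split() on the underlying vector with the labels repeated once per column.

```r
# Toy 4 x 2 matrix standing in for dataA (hypothetical data).
m <- matrix(1:8, nrow = 4, dimnames = list(NULL, c("a", "b")))
label <- c(1, 1, 2, 1)  # one cluster label per row

# The default split() method sees m as a vector of length 8
# and recycles the 4 labels over it.
flat <- split(m, label)
str(flat)  # plain vectors, no dim or dimnames

# The same grouping written out by hand: labels repeated once per column.
by_hand <- split(as.vector(m), rep(label, times = ncol(m)))
identical(flat, by_hand)
```

This also matches the numbers in the question: a piece of length 2628 is 438 rows times 6 columns, all flattened into one vector.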

The quick and easy way to split() would be to convert to a data frame first:

split(as.data.frame(dataA), label)

However, the list elements are then data frames, which may cause issues in later calculations, and you may need to resort to coercing those list elements back to matrices. I would recommend you use lapply() to split the data, as follows.

clusterXlist <- lapply(
    unique(label),
    function(i) dataA[label == i, , drop = FALSE]
)

This properly maintains the matrix structure in every list element.

str(clusterXlist[[1]])
# num [1:18, 1:6] 0.0529 -0.3909 0.1 0.8393 -0.2046 ...
# - attr(*, "dimnames")=List of 2
#  ..$ : NULL
#  ..$ : chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
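If you also want the list elements named by cluster label, one possible variant (a sketch, not part of the approach above) is to split the row numbers rather than the matrix itself and then subset once per group. The objects m, label, rows_by_cluster, clusters and via_df below are hypothetical stand-ins for dataA and the cutree() output.

```r
# Toy matrix and labels standing in for dataA and cutree() output (hypothetical).
set.seed(1)
m <- matrix(rnorm(12), nrow = 6, dimnames = list(NULL, c("Fresh", "Milk")))
label <- c(1, 2, 1, 3, 2, 1)

# Split the row numbers by cluster, then subset the matrix once per group.
rows_by_cluster <- split(seq_len(nrow(m)), label)
clusters <- lapply(rows_by_cluster, function(idx) m[idx, , drop = FALSE])

names(clusters)       # list is named by cluster label
dim(clusters[["3"]])  # a single-row cluster is still a matrix because drop = FALSE

# Data-frame route for comparison, coerced back to matrices afterwards.
via_df <- lapply(split(as.data.frame(m), label), as.matrix)
```

The drop = FALSE argument matters whenever a cluster has only one member, which is common with single linkage; without it that element would collapse to a plain vector.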

edited Jan 19 at 20:52

answered Jan 19 at 20:28

Rich Scriven
