Split function not maintaining structure of dataframe?
I am doing hierarchical clustering in R and need each cluster's elements separately.
When I use the following code, the data splits into a list of 3 plain numeric vectors (e.g. num [1:2628]); no column information from the original dataframe (dataA) is carried over.
clusterA <- hclust(dist(dataA), method = "single")
NumA <- 3
label <- cutree(clusterA, NumA)
clusterXlist <- split(dataA, f = label)
str(clusterXlist[[1]])
How can I make sure that the result maintains the structure of dataA?
Edit:
In my case
> str(clusterXlist[[1]])
num [1:2628] 0.0529 -0.3909 -0.4465 0.1 0.8393 ...
whereas for dataA
> str(dataA)
num [1:440, 1:6] 0.0529 -0.3909 -0.4465 0.1 0.8393 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
- attr(*, "scaled:center")= Named num [1:6] 12000 5796 7951 3072 2881 ...
..- attr(*, "names")= chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
- attr(*, "scaled:scale")= Named num [1:6] 12647 7380 9503 4855 4768 ...
..- attr(*, "names")= chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
Edit 2:
For dataA
> dput(head(dataA,n=20))
structure(c(0.0528730042415329, -0.390857056063646, -0.44652098379972,
0.0999975794271863, 0.839284119671916, -0.204572661537808, 0.00993903725191922,
-0.349583518736614, -0.477357534676238, -0.473957607271904, -0.682697336282181,
0.0905884780058897, 1.55872457204484, 0.728746944991474, 1.00042486502152,
-0.138155475034538, -0.868191050016313, -0.484236457564077, 0.521904849881291,
-0.333690834823332, 0.522972471408079, 0.543838613660349, 0.408073194590386,
-0.623310408164662, -0.0523368792616442, 0.333686752405346, -0.351915064454946,
-0.113851350576777, -0.291078065290861, 0.717677967619194, -0.053285340273111,
-0.63306600713975, 0.883794139056095, 0.0557876760455718, 0.497093035238056,
-0.634420951441845, 0.409157150032062, 0.0488774601048851, 0.0719115132405076,
-0.447303143322465, -0.0410681453901357, 0.170124700204028, -0.0281250860936324,
-0.3925300807586, -0.0792659545334748, -0.297298628211157, -0.10273182626616,
0.15518230654465, -0.185125447641461, 1.15011422238562, 0.528531691780372,
-0.360751187201331, 0.400469064432042, 0.739829765498898, 0.435615257968889,
-0.434621330503326, 0.438772101699743, -0.528063904936618, 0.226000834240152,
0.159180975270399, -0.588697039406295, -0.269829034507317, -0.137379339965946,
0.68636300602308, 0.173661155768845, -0.495590877769126, -0.533904475256987,
-0.288985833251248, -0.545233764836731, -0.394039245717966, 0.273564891153861,
-0.340276616984998, -0.573659982327726, 0.00475174748902491,
-0.572218072744849, -0.551001403168238, -0.605176006067741, -0.459955112363749,
-0.178576756619561, -0.494972916519322, -0.0435191938188023,
0.0863085949200282, 0.13308015693741, -0.498021323377842, -0.23165413161966,
-0.227878848586867, 0.0542186891412866, 0.0921812574154842, -0.244448146341904,
0.952945788892319, 0.649245242698738, -0.489212329634658, 0.209634507324604,
0.802353943473126, 0.456496070080021, -0.40217108193415, 0.341140199633565,
-0.526755422016323, -0.0240135648160378, -0.0762383134363428,
-0.066263629344282, 0.0890496850231094, 2.24074190324533, 0.0933048443208461,
1.29786952218849, -0.0261942126239276, -0.347458739603052, 0.369181005457445,
-0.274766434933383, 0.203229792845712, 0.0777025935624781, -0.364479376793999,
0.498608767430271, -0.327246732938803, 0.228051555415843, -0.394620088486301,
-0.157749554245622, 1.04716972023017, 0.587257919466454, -0.36306099036142
), .Dim = c(20L, 6L), .Dimnames = list(NULL, c("Fresh", "Milk",
"Grocery", "Frozen", "Detergents_Paper", "Delicassen")))
For clusterXlist[[1]], which was obtained by splitting dataA
> dput(head(clusterXlist[[1]],n=20))
c(0.0528730042415329, -0.390857056063646, -0.44652098379972,
0.0999975794271863, 0.839284119671916, -0.204572661537808, 0.00993903725191922,
-0.349583518736614, -0.477357534676238, -0.473957607271904, -0.682697336282181,
0.0905884780058897, 1.55872457204484, 0.728746944991474, 1.00042486502152,
-0.138155475034538, -0.868191050016313, -0.484236457564077, 0.521904849881291,
-0.333690834823332)
r
Can you provide an example? It's hard to understand what you are asking.
– anotherfred
Jan 19 at 18:34
I've added it to the question.
– saket sinha
Jan 19 at 19:16
It's a start, saket, but I cannot take the output from your str(dataA) and do anything with it. Can you provide the output from dput(head(dataA,n=20)) (or some meaningful number of rows)? It provides an unambiguous and reproducible set of data. And if it matters, nowhere in this question do I see a data.frame, just matrices and lists, and the hclust and split functions return lists.
– r2evans
Jan 19 at 19:51
dataA is not a data frame but a numeric matrix by the looks of it. A dput(dataA) would be good.
– Rich Scriven
Jan 19 at 20:00
added dput for both
– saket sinha
Jan 19 at 20:08
edited Jan 21 at 15:17 by marc_s
asked Jan 19 at 18:29 by saket sinha
1 Answer
What you have there is a matrix, not a data frame.
class(dataA)
# [1] "matrix"    (R >= 4.0.0 prints "matrix" "array")
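To see why this matters, here is a minimal sketch on made-up data (the names m and lab are hypothetical stand-ins for dataA and label, not taken from the question). As far as I understand it, the default split() method treats a matrix as the plain vector of its values and recycles the grouping vector over them, so the dimensions and column names are lost. That would also be consistent with the num [1:2628] in the question: 2628 = 438 x 6, i.e. 438 of the 440 rows in the first cluster, counted once per column.

```r
# Hypothetical 6 x 2 matrix standing in for dataA
m <- matrix(1:12, nrow = 6, dimnames = list(NULL, c("a", "b")))
# Hypothetical cluster labels, one per row (the shape cutree() returns)
lab <- c(1, 1, 2, 1, 2, 1)

# split() sees only the 12 underlying values; lab is recycled over them,
# so each piece is a plain vector with no dim or column names
flat <- split(m, lab)
str(flat)

# Row-wise subsetting keeps both columns and their names
byrow <- lapply(sort(unique(lab)), function(i) m[lab == i, , drop = FALSE])
str(byrow[[1]])
```

The drop = FALSE matters when a cluster has a single row; without it that piece would collapse to a vector as well.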
The quick and easy way to split() would be to do
split(as.data.frame(dataA), label)
However, this may cause issues in later calculations and you may need to resort to coercing those list elements back to a matrix. I would recommend you use lapply() to split the data, as follows.
clusterXlist <- lapply(
unique(label),
function(i) dataA[label == i, , drop = FALSE]
)
This properly maintains the matrix structure in every list element.
str(clusterXlist[[1]])
# num [1:18, 1:6] 0.0529 -0.3909 0.1 0.8393 -0.2046 ...
# - attr(*, "dimnames")=List of 2
# ..$ : NULL
# ..$ : chr [1:6] "Fresh" "Milk" "Grocery" "Frozen" ...
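For comparison, a small sketch that runs both routes on synthetic data and checks that they hold the same numbers. The data here is randomly generated and only stands in for the real dataA; the names ids, by_subset, by_split and same are made up for illustration, and naming the list by cluster label is my own addition, not something the code above does.

```r
# Hypothetical stand-in for dataA: a scaled 10 x 3 numeric matrix
set.seed(1)
dataA <- scale(matrix(rnorm(30), nrow = 10,
                      dimnames = list(NULL, c("Fresh", "Milk", "Grocery"))))
label <- cutree(hclust(dist(dataA), method = "single"), k = 3)

# Route 1: subset rows per cluster; names make clusters addressable by label
ids <- sort(unique(label))
by_subset <- lapply(setNames(ids, ids),
                    function(i) dataA[label == i, , drop = FALSE])

# Route 2: split a data frame, then coerce each piece back to a matrix
by_split <- lapply(split(as.data.frame(dataA), label), as.matrix)

# Compare the values cluster by cluster, ignoring row and column names
same <- mapply(function(a, b) isTRUE(all.equal(unname(a), unname(b))),
               by_subset, by_split)
all(same)
```

With the names in place, by_subset[["2"]] gives the rows of cluster 2 directly. Note that the "scaled:center" and "scaled:scale" attributes set by scale() are not carried over to the pieces (the str() output above shows only dimnames), so keep the original dataA around if you need them later.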
edited Jan 19 at 20:52
answered Jan 19 at 20:28 by Rich Scriven