Here, we’re going to read in an FCS file line by line and see what’s there. We’re reading in Stanford HIMC data from here.
fcs_file <- "081216-Mike-HIMC ctrls-001_01_normalized.fcs"
dat <- readLines(fcs_file)
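The first line contains the HEADER segment and the TEXT segment, plus the first few bytes of the DATA segment. The rest of the DATA segment makes up the subsequent lines (the line breaks are just wherever a newline byte happens to fall in the binary). HEADER, TEXT, and DATA are the three required segments of an FCS file; an ANALYSIS segment is optional and is empty here.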
So starting with the HEADER and TEXT segment:
dat[1]
## [1] "FCS3.0 100 3385 338549003385 0 0 \\$BEGINANALYSIS\\0\\$ENDANALYSIS\\0\\$BEGINSTEXT\\0\\$ENDSTEXT\\0\\$NEXTDATA\\0\\$TOT\\250000\\$PAR\\49\\FCSversion\\3\\CREATOR\\PengQiu FCS writer\\$COM\\PengQiu FCS writer\\FILENAME\\081216-Mike-HIMC ctrls-001_01_normalized.fcs\\GUID\\1.fcs\\ORIGINALGUID\\1.fcs\\$BYTEORD\\4,3,2,1\\$DATATYPE\\F\\$MODE\\L\\$BTIM\\10:44:05\\$ETIM\\11:03:20\\$CYT\\DVSSCIENCES-CYTOF-6.5.358\\$DATE\\12-Aug-2016\\$CYTSN\\Helios\\$P1B\\32\\$P1N\\Time\\$P1R\\1129870\\$P1E\\0,0\\$P2B\\32\\$P2N\\Event_length\\$P2R\\68\\$P2E\\0,0\\$P3B\\32\\$P3N\\In113Di\\$P3S\\113In_CD57\\$P3R\\7887\\$P3E\\0,0\\$P4B\\32\\$P4N\\In115Di\\$P4S\\115In_Dead\\$P4R\\7527\\$P4E\\0,0\\$P5B\\32\\$P5N\\Sn120Di\\$P5S\\120Sn\\$P5R\\16821\\$P5E\\0,0\\$P6B\\32\\$P6N\\I127Di\\$P6S\\127I\\$P6R\\10781\\$P6E\\0,0\\$P7B\\32\\$P7N\\Xe131Di\\$P7S\\131Xe\\$P7R\\665\\$P7E\\0,0\\$P8B\\32\\$P8N\\Ba138Di\\$P8S\\138Ba\\$P8R\\18370\\$P8E\\0,0\\$P9B\\32\\$P9N\\Ce140Di\\$P9S\\140Ce_Bead\\$P9R\\8547\\$P9E\\0,0\\$P10B\\32\\$P10N\\Nd142Di\\$P10S\\142Nd_CD19\\$P10R\\3094\\$P10E\\0,0\\$P11B\\32\\$P11N\\Nd143Di\\$P11S\\143Nd_CD4\\$P11R\\679\\$P11E\\0,0\\$P12B\\32\\$P12N\\Nd144Di\\$P12S\\144Nd_CD8\\$P12R\\1678\\$P12E\\0,0\\$P13B\\32\\$P13N\\Nd146Di\\$P13S\\146Nd_IgD\\$P13R\\1770\\$P13E\\0,0\\$P14B\\32\\$P14N\\Sm147Di\\$P14S\\147Sm_CD85j\\$P14R\\2155\\$P14E\\0,0\\$P15B\\32\\$P15N\\Nd148Di\\$P15S\\148Nd_CD11c\\$P15R\\1561\\$P15E\\0,0\\$P16B\\32\\$P16N\\Sm149Di\\$P16S\\149Sm_CD16\\$P16R\\3670\\$P16E\\0,0\\$P17B\\32\\$P17N\\Nd150Di\\$P17S\\150Nd_CD3\\$P17R\\1980\\$P17E\\0,0\\$P18B\\32\\$P18N\\Eu151Di\\$P18S\\151Eu_CD38\\$P18R\\7181\\$P18E\\0,0\\$P19B\\32\\$P19N\\Sm152Di\\$P19S\\152Sm_CD27\\$P19R\\1582\\$P19E\\0,0\\$P20B\\32\\$P20N\\Eu153Di\\$P20S\\153Eu_CD11b\\$P20R\\7941\\$P20E\\0,0\\$P21B\\32\\$P21N\\Sm154Di\\$P21S\\154Sm_CD14\\$P21R\\1910\\$P21E\\0,0\\$P22B\\32\\$P22N\\Gd155Di\\$P22S\\155Gd_CCR6\\$P22R\\8951\\$P22E\\0,0\\$P23B\\32\\$P23N\\Gd156Di\\$P23S\\156Gd_CD94\\$P23R\\2121\\$P23E\\0,0\\$P24B\\32\\$P2
4N\\Gd157Di\\$P24S\\157Gd_CD86\\$P24R\\5504\\$P24E\\0,0\\$P25B\\32\\$P25N\\Gd158Di\\$P25S\\158Gd_CXCR5\\$P25R\\1520\\$P25E\\0,0\\$P26B\\32\\$P26N\\Tb159Di\\$P26S\\159Tb_CXCR3\\$P26R\\641\\$P26E\\0,0\\$P27B\\32\\$P27N\\Gd160Di\\$P27S\\160Gd_CCR7\\$P27R\\673\\$P27E\\0,0\\$P28B\\32\\$P28N\\Dy162Di\\$P28S\\162Dy_CD45RA\\$P28R\\4259\\$P28E\\0,0\\$P29B\\32\\$P29N\\Dy164Di\\$P29S\\164Dy_CD20\\$P29R\\6982\\$P29E\\0,0\\$P30B\\32\\$P30N\\Ho165Di\\$P30S\\165Ho_CD127\\$P30R\\7787\\$P30E\\0,0\\$P31B\\32\\$P31N\\Er166Di\\$P31S\\166Er_CD33\\$P31R\\1067\\$P31E\\0,0\\$P32B\\32\\$P32N\\Er167Di\\$P32S\\167Er_CD28\\$P32R\\1653\\$P32E\\0,0\\$P33B\\32\\$P33N\\Er168Di\\$P33S\\168Er_CD24\\$P33R\\803\\$P33E\\0,0\\$P34B\\32\\$P34N\\Tm169Di\\$P34S\\169Tm_ICOS\\$P34R\\401\\$P34E\\0,0\\$P35B\\32\\$P35N\\Er170Di\\$P35S\\170Er_CD161\\$P35R\\498\\$P35E\\0,0\\$P36B\\32\\$P36N\\Yb171Di\\$P36S\\171Yb_TCRgd\\$P36R\\349\\$P36E\\0,0\\$P37B\\32\\$P37N\\Yb172Di\\$P37S\\172Yb_PD-1\\$P37R\\790\\$P37E\\0,0\\$P38B\\32\\$P38N\\Yb173Di\\$P38S\\173Yb_CD123\\$P38R\\1368\\$P38E\\0,0\\$P39B\\32\\$P39N\\Yb174Di\\$P39S\\174Yb_CD56\\$P39R\\1868\\$P39E\\0,0\\$P40B\\32\\$P40N\\Lu175Di\\$P40S\\175Lu_HLADR\\$P40R\\9669\\$P40E\\0,0\\$P41B\\32\\$P41N\\Yb176Di\\$P41S\\176Yb_CD25\\$P41R\\1271\\$P41E\\0,0\\$P42B\\32\\$P42N\\BCKG190Di\\$P42S\\190BCKG\\$P42R\\130\\$P42E\\0,0\\$P43B\\32\\$P43N\\Ir191Di\\$P43S\\191Ir_DNA1\\$P43R\\32196\\$P43E\\0,0\\$P44B\\32\\$P44N\\Ir193Di\\$P44S\\193Ir_DNA2\\$P44R\\35373\\$P44E\\0,0\\$P45B\\32\\$P45N\\Pb208Di\\$P45S\\208Pb\\$P45R\\30923\\$P45E\\0,0\\$P46B\\32\\$P46N\\Center\\$P46R\\4621\\$P46E\\0,0\\$P47B\\32\\$P47N\\Offset\\$P47R\\512\\$P47E\\0,0\\$P48B\\32\\$P48N\\Width\\$P48R\\229\\$P48E\\0,0\\$P49B\\32\\$P49N\\Residual\\$P49R\\580\\$P49E\\0,0\\$BEGINDATA\\3385\\$ENDDATA\\49003385\\ A\037\x95\x81A`"
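Ok, so what do we have? We start with the HEADER segment. It holds the format version, followed by the byte offsets of the start and end of the TEXT, DATA, and ANALYSIS segments. Each offset sits in a fixed-width, 8-character field, so with the padding collapsed, as it is here, the DATA offsets 3385 and 49003385 run together as 338549003385.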
FCS3.0 100 3385 338549003385 0 0
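The fixed-width layout can be sketched in a few lines of R. This is a minimal sketch that assumes the FCS 3.0 HEADER layout (a 6-character version string, 4 spaces, then six right-justified 8-character offset fields) and rebuilds the header string by hand from the values shown above rather than reading the file. The names `header`, `fields`, and `offsets` are made up for illustration.

```r
# Rebuild the HEADER as it is laid out on disk: fixed-width, space-padded fields.
header <- sprintf("%-10s%8d%8d%8d%8d%8d%8d",
                  "FCS3.0", 100L, 3385L, 3385L, 49003385L, 0L, 0L)

# The version occupies characters 1-6. The offsets start at character 11,
# and each one takes up 8 characters.
version <- substr(header, 1, 6)
fields <- substring(header, seq(11, 51, by = 8), seq(18, 58, by = 8))
offsets <- setNames(as.integer(trimws(fields)),
                    c("text_start", "text_end", "data_start",
                      "data_end", "analysis_start", "analysis_end"))
offsets
```

Slicing by position rather than splitting on whitespace is what keeps 3385 and 49003385 apart.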
Then we have a TEXT segment. That starts with:
\\$BEGINANALYSIS\\0\\$ENDANALYSIS\\0\\$BEGINSTEXT\\0\\$ENDSTEXT\\0\\$NEXTDATA\\0\\$TOT\\250000\\$PAR\\49\\FCSversion\\3\\CREATOR\\PengQiu FCS writer\\$COM\\PengQiu FCS writer\\FILENAME\\081216-Mike-HIMC ctrls-001_01_normalized.fcs\
A bit later, we find:
$CYT\\DVSSCIENCES-CYTOF-6.5.358
DVS? That’s DVS Sciences, the original company that made the CyTOF. Fluidigm acquired them in 2014, and Fluidigm was renamed Standard BioTools in 2022.
When was this file created?
$DATE\\12-Aug-2016
2 years after the acquisition.
What else do we have here? We know it’s a Helios:
$CYTSN\\Helios
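And then we have the parameter data, written as $P, followed by a number and a letter: B is the number of bits per value, N is the short (channel) name, S is the long (marker) name, R is the range, and E is the amplification type. For example, parameter 1 is time.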
$P1B\\32\\$P1N\\Time\\$P1R\\1129870\\$P1E\\0,0\\
Parameter 10 is CD19.
$P10B\\32\\$P10N\\Nd142Di\\$P10S\\142Nd_CD19\\$P10R\\3094\\$P10E\\0,0\\
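The TEXT segment is a run of keyword/value pairs separated by a delimiter, which is the first character of the segment (a backslash here). As a rough sketch, a shortened, hand-typed copy of the TEXT segment can be split into a named vector like this. The variable `text_segment` is an assumption made so that the example runs without the file, and the sketch ignores the case where a delimiter is escaped inside a value, which a real parser would have to handle.

```r
# A shortened, hand-typed stand-in for the TEXT segment.
# In R source code, "\\" is a single backslash.
text_segment <- paste0("\\$TOT\\250000\\$PAR\\49\\$P1N\\Time",
                       "\\$P10N\\Nd142Di\\$P10S\\142Nd_CD19",
                       "\\$BEGINDATA\\3385\\")

# The first character of the TEXT segment is the delimiter.
delimiter <- substr(text_segment, 1, 1)

# Split on the delimiter and drop the empty token before the first delimiter.
tokens <- strsplit(text_segment, delimiter, fixed = TRUE)[[1]][-1]

# Odd-numbered tokens are keywords, even-numbered tokens are their values.
keywords <- setNames(tokens[c(FALSE, TRUE)], tokens[c(TRUE, FALSE)])
keywords[["$P10S"]]
as.integer(keywords[["$BEGINDATA"]])
```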
Where does the DATA segment begin? Found it. Near the end of the TEXT segment above, it tells you $BEGINDATA\\3385.
That means byte offset 3385, counting the first byte of the file as 0.
We have to read that in as binary numbers. Because it was read in as text here, it’s going to be illegible, but we’ll look at it anyway.
dat[2:10]
## [1] "[#A\x80"
## [2] ""
## [3] ","
## [4] "`U@\x84d\003"
## [5] "\a"
## [6] "B\xd8\xe6\xe9Ap"
## [7] "\017@[GF"
## [8] "\xa1\xacC\b1\x87?\xb49\xc8>oJ\xd2@\xb1h\xbdD\003\xf3\xaaA\003\xed$B\xfc/'?\xba\x8d\xa1Ab\xfa\xca"
## [9] "=\x89\x9c\xfbD\x9f\xe0\xd1E"
This is binary that needs to be read in as numeric values, not text, in order to be legible. So let’s read in the binary ourselves.
We have to start at byte 3385. Let’s look at the relevant data information, which we have already read in.
$BYTEORD\\4,3,2,1\\
This is the byte order called big endian, and it is the byte order of the data we’re going to look at. Each value here takes up 4 bytes. If we think of those 4 bytes as the digits of a 4-digit number (an analogy), big endian stores the most significant byte first, the same way we write decimal numbers with the largest place on the left. Little endian stores the least significant byte first, so the bytes appear in the reverse order. The FCS keyword value for little endian would be 1,2,3,4.
Of note, GPT-4 told me that this is little endian, but the flowCore codebase suggests otherwise. I would guess that there is some confusion in the field about how the endianness of byte order is specified.
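One way to convince yourself is to write a known number out in both byte orders. The sketch below rests on my reading of the $BYTEORD convention, which numbers the bytes of a 4-byte value from 1 (least significant) to 4 (most significant). The integer is chosen so that each byte holds its own number.

```r
# 0x04030201: the least significant byte is 01, the most significant is 04.
value <- 67305985L

# writeBin() returns a raw vector when it is given raw() instead of a connection.
big    <- writeBin(value, raw(), size = 4, endian = "big")
little <- writeBin(value, raw(), size = 4, endian = "little")

big     # most significant byte first, like $BYTEORD 4,3,2,1
little  # least significant byte first, like $BYTEORD 1,2,3,4

# Reading bytes back with the wrong byte order silently gives a different number.
right <- readBin(big, what = "integer", size = 4, endian = "big")
wrong <- readBin(big, what = "integer", size = 4, endian = "little")
c(right = right, wrong = wrong)
```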
$DATATYPE\\F\\
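This says that the data type is a float. For all practical purposes, think of this as a number that can include a decimal point. An integer, by comparison, cannot. Together with the $PnB value of 32 that we saw for each parameter, it means each value takes up 4 bytes.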
$MODE\\L
This means list mode: the data are stored as one big continuous list, event after event, with each event’s parameter values written in order.
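With these keywords in hand, we can check that the DATA segment is the size we expect. This is a small arithmetic sketch using only the numbers printed in the TEXT segment above; the variable names are made up for illustration.

```r
n_events <- 250000          # $TOT
n_params <- 49              # $PAR
bytes_per_value <- 32 / 8   # $PnB is 32 bits for every parameter

# List mode with 4-byte floats: every event contributes n_params values.
expected_bytes <- n_events * n_params * bytes_per_value

data_start <- 3385          # $BEGINDATA
data_end   <- 49003385      # $ENDDATA

c(expected = expected_bytes, from_offsets = data_end - data_start)
```

The two numbers agree. Depending on whether a writer treats the end offset as the last data byte or the byte after it, this difference can be off by one, so I would treat $TOT and $PAR as the more reliable guide to how many values to read.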
# Make the file connection.
file_con <- file(fcs_file, "rb")
# Move the file connection pointer to the start of the data segment
data_start <- 3385
seek(file_con, data_start, origin = "start")
## [1] 0
bin <- readBin(file_con, what = "raw", n = 1000000000) # read it to the end, for later
close(file_con) # the bytes now live in `bin`, so the connection can be closed
bin[1:10]
## [1] 41 1f 95 81 41 60 00 00 00 00
The numbers (and letters) you’re seeing are the values of each byte (8 bits, each of which is a 0 or a 1) shown in what is known as hexadecimal, or hex. What is hex? It’s base 16. What does that look like? We’ll count upward in the different numbering systems, starting with binary (which you’ve probably at least seen before), moving to decimal (what we are used to), and then to hex.
Here’s binary. 0, 1, 10, 11, 100, 101, 110, 111…
Here’s decimal. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11…
Here’s hex. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f, 10, 11…
It turns out that a string of 8 bits (one byte), from 00000000 to 11111111, can be written in hex as a two-digit number between 00 and ff (0 to 255 in decimal). That’s why you see the hex values in pairs of 2. So it’s a convenient way to show bytes, especially when you’re looking at many strings of 8 bits (many bytes) at the same time.
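If you want to move between the three numbering systems yourself, base R has what you need. This is a small sketch using `strtoi()`, `sprintf()`, and `intToBits()`; the variable names are made up for illustration.

```r
# Hex and binary strings to decimal.
from_hex <- strtoi("ff", base = 16L)
from_binary <- strtoi("11111111", base = 2L)
c(from_hex = from_hex, from_binary = from_binary)

# Decimal to two-digit hex, the way raw bytes are printed.
hex <- sprintf("%02x", c(0L, 9L, 10L, 15L, 16L, 255L))
hex

# The 8 bits of the byte 41 (hex). intToBits() returns the lowest bit first,
# so we keep the first 8 bits and reverse them.
byte_bits <- paste(rev(as.integer(intToBits(0x41L))[1:8]), collapse = "")
byte_bits
```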
So let’s read them as numbers, based on the info we have from the header. Data type float corresponds to type “numeric” and size of 4. We specify big endian as well. Now let’s look at the numbers.
data_values <- readBin(con = bin,         # readBin() also accepts a raw vector directly
                       what = "numeric",
                       size = 4,
                       n = 250000 * 49,    # number of values: $TOT events x $PAR parameters
                       endian = "big")     # `signed` only applies to integers, so it is left out
data_values[1:10]
## [1] 9.974000 14.000000 0.000000 8.036703 0.000000 226.799026
## [7] 0.000000 11.316442 0.000000 0.000000
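To see where 9.974 comes from, we can decode the first four bytes (41 1f 95 81) by hand. This is a sketch that assumes the usual single-precision layout of 1 sign bit, 8 exponent bits, and 23 fraction bits, and it only covers ordinary (normal) numbers, not zero, very small numbers, or special values like infinity.

```r
# The first four bytes of the DATA segment, most significant byte first.
bytes <- as.raw(c(0x41, 0x1f, 0x95, 0x81))
b <- as.integer(bytes)  # each byte as a number from 0 to 255

sign     <- b[1] %/% 128                                # top bit of byte 1
exponent <- (b[1] %% 128) * 2 + b[2] %/% 128            # next 8 bits
fraction <- (b[2] %% 128) * 65536 + b[3] * 256 + b[4]   # remaining 23 bits

# value = (-1)^sign * (1 + fraction / 2^23) * 2^(exponent - 127)
by_hand <- (-1)^sign * (1 + fraction / 2^23) * 2^(exponent - 127)

# The same four bytes, decoded by readBin().
by_readbin <- readBin(bytes, what = "numeric", size = 4, endian = "big")

c(by_hand = by_hand, by_readbin = by_readbin)
```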
Ok, so we have numbers that seem to be within the range of CyTOF data. Can we find these respective values in our file? We’ll read it in as a flow frame as a control.
library(flowCore)
ff <- flowCore::read.FCS(fcs_file)
## uneven number of tokens: 525
## The last keyword is dropped.
## uneven number of tokens: 525
## The last keyword is dropped.
exp <- exprs(ff)
head(exp)
## Time Event_length In113Di In115Di Sn120Di I127Di Xe131Di
## [1,] 9.974 14 0.0000000 8.0367031 0.000000 226.799026 0.000000
## [2,] 10.469 24 0.0000000 0.0000000 0.000000 0.000000 0.000000
## [3,] 15.143 16 0.5145568 7.5084190 0.000000 6.662100 0.729067
## [4,] 35.339 16 16.4438553 0.0000000 3.729181 1.179855 0.000000
## [5,] 47.435 17 0.0000000 0.0000000 0.000000 7.391151 0.000000
## [6,] 58.477 17 0.0000000 0.3225487 0.000000 140.800934 0.000000
## Ba138Di Ce140Di Nd142Di Nd143Di Nd144Di Nd146Di Sm147Di
## [1,] 11.316442 0.000 0.0000000 14.0439272 8.326587 0.000000 115.311111
## [2,] 1.706828 3553.608 450.7016602 0.6088966 2.823561 2.354746 0.000000
## [3,] 47.998772 0.000 1.0432056 3.6501691 339.169403 0.000000 0.000000
## [4,] 62.383766 0.000 0.0000000 0.0000000 7.539814 0.000000 9.050664
## [5,] 19.445116 0.000 3.4858465 121.4924698 10.237396 1.924217 0.000000
## [6,] 16.954496 0.000 0.4329208 7.5059586 4.288917 0.000000 116.587341
## Nd148Di Sm149Di Nd150Di Eu151Di Sm152Di Eu153Di
## [1,] 92.4819946 2.6855114 0.000000 118.25839 3.85027409 364.892303
## [2,] 0.0000000 0.9591464 3.309369 2974.36621 2.48620296 3749.466553
## [3,] 0.8043697 0.2971367 178.258209 0.00000 11.93834400 0.000000
## [4,] 0.0000000 3.0236046 0.000000 102.07199 0.01330179 1.039183
## [5,] 0.0000000 0.0000000 131.106201 45.62723 179.25703430 0.000000
## [6,] 87.0736313 0.3768456 0.000000 171.27834 0.72490025 8.707788
## Sm154Di Gd155Di Gd156Di Gd157Di Gd158Di Tb159Di
## [1,] 72.9170074 1.557252 0.0000000 64.4974213 0.1141828 0.0000000
## [2,] 0.5006996 1.562974 35.7167282 0.0000000 4.6698203 0.1269188
## [3,] 0.0000000 1.677551 171.0736694 2.0096011 0.0000000 54.1909065
## [4,] 0.6175294 0.000000 118.2227173 1.2124252 0.3072502 0.0000000
## [5,] 0.3943402 0.000000 0.1050521 0.6307476 0.8010152 5.9334769
## [6,] 14.1738214 0.397190 2.4590282 126.2724686 6.8257523 0.0000000
## Gd160Di Dy162Di Dy164Di Ho165Di Er166Di Er167Di
## [1,] 0.000000 0.09437896 0.0000000 0.00000 159.287155 0.9581637
## [2,] 0.000000 0.00000000 0.0000000 3566.91602 2.497736 2.7491443
## [3,] 6.429448 0.00000000 1.6976601 13.65625 3.002327 176.1483154
## [4,] 0.000000 158.91073608 0.2398935 0.00000 0.000000 0.0000000
## [5,] 101.591438 84.61638641 0.0000000 17.55208 1.347812 112.9404984
## [6,] 0.000000 1.10217929 2.0162458 0.00000 279.289337 12.3298969
## Er168Di Tm169Di Er170Di Yb171Di Yb172Di Yb173Di Yb174Di
## [1,] 0.5559384 0.000000 0.0000000 0.0000000 1.492987 3.4290566 0.82368094
## [2,] 0.8139601 0.000000 0.0000000 0.5873441 0.000000 0.7412859 0.00000000
## [3,] 8.9375687 0.000000 0.6352956 0.0000000 8.507373 0.0000000 0.00000000
## [4,] 0.0000000 0.000000 28.3594170 0.0000000 0.000000 0.0000000 17.49743652
## [5,] 2.3422778 7.284369 0.0000000 0.0000000 0.000000 0.0000000 0.00000000
## [6,] 0.0000000 0.000000 0.0000000 0.3184899 1.066509 8.2221928 0.05851584
## Lu175Di Yb176Di BCKG190Di Ir191Di Ir193Di Pb208Di Center
## [1,] 151.935501 0.9181435 0.0000000 1800.2378 3246.300 0 491.1563
## [2,] 3539.489014 36.7945709 0.0000000 0.0000 0.000 0 710.2842
## [3,] 0.623376 0.0000000 0.6726175 1612.4308 2854.078 0 547.6981
## [4,] 0.000000 0.3830891 3.4977818 927.3303 1805.964 0 550.8316
## [5,] 0.000000 0.0000000 1.1105798 1270.4244 2148.269 0 557.1343
## [6,] 518.293762 10.5734234 0.0000000 1565.4546 2757.719 0 559.6838
## Offset Width Residual
## [1,] 75.38984 30.92957 33.27013
## [2,] 66.95893 45.69018 63.57212
## [3,] 75.96170 35.72036 52.15306
## [4,] 82.48290 39.00098 52.18004
## [5,] 88.54017 44.38193 46.69464
## [6,] 84.81390 42.38083 40.17693
Did you find our stream of numbers? It seems to be reading row by row. I’ll show you.
data_values[1:10]
## [1] 9.974000 14.000000 0.000000 8.036703 0.000000 226.799026
## [7] 0.000000 11.316442 0.000000 0.000000
exp[1,1:10]
## Time Event_length In113Di In115Di Sn120Di I127Di
## 9.974000 14.000000 0.000000 8.036703 0.000000 226.799026
## Xe131Di Ba138Di Ce140Di Nd142Di
## 0.000000 11.316442 0.000000 0.000000
Ok, now for the big challenge. Can we make the expression matrix ourselves? First, we know that the data are being read in row by row. So we need the parameter names. They were stored in the TEXT segment. Let’s pull them out. We have the P, a number, and then a letter. $P1N is the channel name, and $P1S is the marker name (e.g. CD14).
Let’s pull out the PnN’s from the TEXT segment.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks flowCore::filter(), stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
markers <- dat[[1]] %>% str_split("\\\\") %>% .[[1]] %>% .[grep("P\\d+N", .) + 1]
## Warning in grep("P\\d+N", .): unable to translate ' A<95><81>A`' to a wide
## string
## Warning in grep("P\\d+N", .): input string 526 is invalid
markers
## [1] "Time" "Event_length" "In113Di" "In115Di" "Sn120Di"
## [6] "I127Di" "Xe131Di" "Ba138Di" "Ce140Di" "Nd142Di"
## [11] "Nd143Di" "Nd144Di" "Nd146Di" "Sm147Di" "Nd148Di"
## [16] "Sm149Di" "Nd150Di" "Eu151Di" "Sm152Di" "Eu153Di"
## [21] "Sm154Di" "Gd155Di" "Gd156Di" "Gd157Di" "Gd158Di"
## [26] "Tb159Di" "Gd160Di" "Dy162Di" "Dy164Di" "Ho165Di"
## [31] "Er166Di" "Er167Di" "Er168Di" "Tm169Di" "Er170Di"
## [36] "Yb171Di" "Yb172Di" "Yb173Di" "Yb174Di" "Lu175Di"
## [41] "Yb176Di" "BCKG190Di" "Ir191Di" "Ir193Di" "Pb208Di"
## [46] "Center" "Offset" "Width" "Residual"
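The same idea extends to the marker names. As a sketch, here is how the $PnN and $PnS values could be paired up, falling back to the channel name when a parameter has no $PnS (as with Time above). The `keywords` vector is a hand-typed stand-in for a few keyword/value pairs from the TEXT segment, and the column names are made up for illustration.

```r
# A hand-typed stand-in for a few keyword/value pairs from the TEXT segment.
keywords <- c("$P1N"  = "Time",
              "$P3N"  = "In113Di", "$P3S"  = "113In_CD57",
              "$P10N" = "Nd142Di", "$P10S" = "142Nd_CD19")

idx <- c(1, 3, 10)

# Look up each parameter's channel name and marker name by keyword.
channel <- unname(keywords[paste0("$P", idx, "N")])
marker  <- unname(keywords[paste0("$P", idx, "S")])  # NA when there is no $PnS

# Use the marker name when there is one, otherwise the channel name.
label <- ifelse(is.na(marker), channel, marker)

data.frame(parameter = idx, channel = channel, marker = marker, label = label)
```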
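Ok, now for the values. Each cell (event) is 49 elements, stored one cell after another. R fills a matrix column by column, so we fill a matrix with 49 rows, which puts one cell in each column, and then transpose it to get one cell per row and 49 columns. Then we can name the columns using the markers that we just fished out.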
my_exp <- matrix(data_values, nrow = 49) %>% t()
colnames(my_exp) <- markers
head(my_exp)
## Time Event_length In113Di In115Di Sn120Di I127Di Xe131Di
## [1,] 9.974 14 0.0000000 8.0367031 0.000000 226.799026 0.000000
## [2,] 10.469 24 0.0000000 0.0000000 0.000000 0.000000 0.000000
## [3,] 15.143 16 0.5145568 7.5084190 0.000000 6.662100 0.729067
## [4,] 35.339 16 16.4438553 0.0000000 3.729181 1.179855 0.000000
## [5,] 47.435 17 0.0000000 0.0000000 0.000000 7.391151 0.000000
## [6,] 58.477 17 0.0000000 0.3225487 0.000000 140.800934 0.000000
## Ba138Di Ce140Di Nd142Di Nd143Di Nd144Di Nd146Di Sm147Di
## [1,] 11.316442 0.000 0.0000000 14.0439272 8.326587 0.000000 115.311111
## [2,] 1.706828 3553.608 450.7016602 0.6088966 2.823561 2.354746 0.000000
## [3,] 47.998772 0.000 1.0432056 3.6501691 339.169403 0.000000 0.000000
## [4,] 62.383766 0.000 0.0000000 0.0000000 7.539814 0.000000 9.050664
## [5,] 19.445116 0.000 3.4858465 121.4924698 10.237396 1.924217 0.000000
## [6,] 16.954496 0.000 0.4329208 7.5059586 4.288917 0.000000 116.587341
## Nd148Di Sm149Di Nd150Di Eu151Di Sm152Di Eu153Di
## [1,] 92.4819946 2.6855114 0.000000 118.25839 3.85027409 364.892303
## [2,] 0.0000000 0.9591464 3.309369 2974.36621 2.48620296 3749.466553
## [3,] 0.8043697 0.2971367 178.258209 0.00000 11.93834400 0.000000
## [4,] 0.0000000 3.0236046 0.000000 102.07199 0.01330179 1.039183
## [5,] 0.0000000 0.0000000 131.106201 45.62723 179.25703430 0.000000
## [6,] 87.0736313 0.3768456 0.000000 171.27834 0.72490025 8.707788
## Sm154Di Gd155Di Gd156Di Gd157Di Gd158Di Tb159Di
## [1,] 72.9170074 1.557252 0.0000000 64.4974213 0.1141828 0.0000000
## [2,] 0.5006996 1.562974 35.7167282 0.0000000 4.6698203 0.1269188
## [3,] 0.0000000 1.677551 171.0736694 2.0096011 0.0000000 54.1909065
## [4,] 0.6175294 0.000000 118.2227173 1.2124252 0.3072502 0.0000000
## [5,] 0.3943402 0.000000 0.1050521 0.6307476 0.8010152 5.9334769
## [6,] 14.1738214 0.397190 2.4590282 126.2724686 6.8257523 0.0000000
## Gd160Di Dy162Di Dy164Di Ho165Di Er166Di Er167Di
## [1,] 0.000000 0.09437896 0.0000000 0.00000 159.287155 0.9581637
## [2,] 0.000000 0.00000000 0.0000000 3566.91602 2.497736 2.7491443
## [3,] 6.429448 0.00000000 1.6976601 13.65625 3.002327 176.1483154
## [4,] 0.000000 158.91073608 0.2398935 0.00000 0.000000 0.0000000
## [5,] 101.591438 84.61638641 0.0000000 17.55208 1.347812 112.9404984
## [6,] 0.000000 1.10217929 2.0162458 0.00000 279.289337 12.3298969
## Er168Di Tm169Di Er170Di Yb171Di Yb172Di Yb173Di Yb174Di
## [1,] 0.5559384 0.000000 0.0000000 0.0000000 1.492987 3.4290566 0.82368094
## [2,] 0.8139601 0.000000 0.0000000 0.5873441 0.000000 0.7412859 0.00000000
## [3,] 8.9375687 0.000000 0.6352956 0.0000000 8.507373 0.0000000 0.00000000
## [4,] 0.0000000 0.000000 28.3594170 0.0000000 0.000000 0.0000000 17.49743652
## [5,] 2.3422778 7.284369 0.0000000 0.0000000 0.000000 0.0000000 0.00000000
## [6,] 0.0000000 0.000000 0.0000000 0.3184899 1.066509 8.2221928 0.05851584
## Lu175Di Yb176Di BCKG190Di Ir191Di Ir193Di Pb208Di Center
## [1,] 151.935501 0.9181435 0.0000000 1800.2378 3246.300 0 491.1563
## [2,] 3539.489014 36.7945709 0.0000000 0.0000 0.000 0 710.2842
## [3,] 0.623376 0.0000000 0.6726175 1612.4308 2854.078 0 547.6981
## [4,] 0.000000 0.3830891 3.4977818 927.3303 1805.964 0 550.8316
## [5,] 0.000000 0.0000000 1.1105798 1270.4244 2148.269 0 557.1343
## [6,] 518.293762 10.5734234 0.0000000 1565.4546 2757.719 0 559.6838
## Offset Width Residual
## [1,] 75.38984 30.92957 33.27013
## [2,] 66.95893 45.69018 63.57212
## [3,] 75.96170 35.72036 52.15306
## [4,] 82.48290 39.00098 52.18004
## [5,] 88.54017 44.38193 46.69464
## [6,] 84.81390 42.38083 40.17693
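The reshaping step is easier to see on a toy example. This sketch uses a made-up vector of 2 events with 3 parameters each, stored event by event like the DATA segment.

```r
flat <- c(1, 2, 3,   # event 1
          4, 5, 6)   # event 2

# R fills matrices column by column, so asking for 3 columns directly
# scrambles the events.
scrambled <- matrix(flat, ncol = 3)
scrambled

# Filling 3 rows and transposing puts each event on its own row...
by_transpose <- t(matrix(flat, nrow = 3))

# ...which is the same as filling row by row.
by_row <- matrix(flat, ncol = 3, byrow = TRUE)

by_transpose
identical(by_transpose, by_row)
```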
And now for the final test. Does my expression matrix equal the expression matrix of the flow frame we read in using flowCore?
all(exp == my_exp)
## [1] TRUE
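The `==` comparison works here because neither matrix contains missing values. As an aside, here is a sketch of two base R alternatives, `identical()` and `all.equal()`, on a made-up matrix that contains an NA. Note that both of them also compare attributes such as column names.

```r
x <- matrix(c(9.974, 14, 0, NA), nrow = 2)
y <- x

# NA == NA is NA, so the element-wise check cannot say TRUE.
elementwise <- all(x == y)

# identical() demands an exact match, including where the NAs are.
exact <- identical(x, y)

# all.equal() allows for tiny floating point differences.
close_enough <- isTRUE(all.equal(x, y))

c(elementwise = elementwise, exact = exact, close_enough = close_enough)
```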
Yes! We have successfully pulled out the expression matrix from an FCS file from scratch.
What can you do with all of this? Well, if you have to debug anything from flowCore, because perhaps you have an FCS file from a new machine that might be in some different format, you now have the intuition to do so. You can also write your own FCS parser if you need to solve particular problems that flowCore can’t help you with.
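To make that concrete, here is a minimal end-to-end sketch. It writes a tiny FCS-like file (3 made-up events, 2 parameters) to a temporary path and then reads it back using only the offsets and keywords stored in the file. It is not a standards-compliant writer or reader: it assumes float data, one of the two byte orders discussed above, and no escaped delimiters, and every variable name is made up for illustration.

```r
# --- Write a tiny FCS-like file: 3 events x 2 parameters, big-endian floats ---
events <- matrix(c(9.5,   14,
                   10.25, 24,
                   15.75, 16),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(NULL, c("Time", "Event_length")))
text <- paste0("\\$TOT\\3\\$PAR\\2\\$BYTEORD\\4,3,2,1\\$DATATYPE\\F",
               "\\$P1N\\Time\\$P2N\\Event_length\\")
text_start <- 58L                                   # the HEADER is 58 bytes long
text_end   <- text_start + nchar(text) - 1L         # offset of the last TEXT byte
data_start <- text_end + 1L
data_end   <- data_start + length(events) * 4L - 1L # 4 bytes per value
header <- sprintf("%-10s%8d%8d%8d%8d%8d%8d", "FCS3.0",
                  text_start, text_end, data_start, data_end, 0L, 0L)

path <- tempfile(fileext = ".fcs")
con <- file(path, "wb")
writeChar(paste0(header, text), con, eos = NULL)    # no terminating nul byte
writeBin(as.vector(t(events)), con, size = 4, endian = "big")  # event by event
close(con)

# --- Read it back using only what is stored in the file ---
con <- file(path, "rb")
hdr <- readChar(con, nchars = 58, useBytes = TRUE)
offsets <- as.integer(trimws(substring(hdr, seq(11, 51, by = 8),
                                       seq(18, 58, by = 8))))

# TEXT segment: split on its first character into keyword/value pairs.
invisible(seek(con, where = offsets[1], origin = "start"))
txt <- readChar(con, nchars = offsets[2] - offsets[1] + 1, useBytes = TRUE)
tokens <- strsplit(txt, substr(txt, 1, 1), fixed = TRUE)[[1]][-1]
kw <- setNames(tokens[c(FALSE, TRUE)], tokens[c(TRUE, FALSE)])
n_par <- as.integer(kw[["$PAR"]])
n_tot <- as.integer(kw[["$TOT"]])
endian <- if (kw[["$BYTEORD"]] == "4,3,2,1") "big" else "little"

# DATA segment: jump to its offset and read $TOT x $PAR 4-byte floats.
invisible(seek(con, where = offsets[3], origin = "start"))
values <- readBin(con, what = "numeric", n = n_tot * n_par,
                  size = 4, endian = endian)
close(con)
unlink(path)

channel_names <- unname(kw[paste0("$P", seq_len(n_par), "N")])
parsed <- matrix(values, ncol = n_par, byrow = TRUE,
                 dimnames = list(NULL, channel_names))
parsed
```

The made-up values were picked so that they can be stored exactly as 4-byte floats, which is why `parsed` comes back equal to `events`.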
Note that generalizing this to all possible types of FCS files is complicated. I have lots of respect for the contributors of the flowCore package for dealing with all the different conditions and edge cases that come with the different files from the different machines.
For further reading on the anatomy of an FCS file, please go here.