Logs: liberachat/#haskell
| 2021-08-06 05:30:33 | <jle`> | Cajun: what are the Word8s supposed to represent? |
| 2021-08-06 05:30:40 | <Cajun> | pixels |
| 2021-08-06 05:30:54 | <jle`> | like color channels? |
| 2021-08-06 05:31:15 | <jle`> | and this is like a row major representation? |
| 2021-08-06 05:31:20 | <Cajun> | in the juicypixels library, it uses `Pixel8` which is a synonym for `Word8` so i might as well convert to Word8 |
| 2021-08-06 05:31:39 | <jle`> | i'm just trying to understand how the image is encoded heh |
| 2021-08-06 05:31:56 | <jle`> | when first looking at the list it seemed like a list of points to connect like an svg maybe heh |
| 2021-08-06 05:32:43 | <Cajun> | its literally just a list like this: "[(Num, Num, Num), (Num, Num, Num)......]" where each Num is an RGB value and each tuple makes a pixel |
| 2021-08-06 05:32:59 | <Axman6> | I would be inclined to split the reading into smaller chunks - assuming you don't need to be super strict with the format, you could do something like: map read . splitOn "," . drop 1 . init |
| 2021-08-06 05:33:35 | <jle`> | Cajun: so, it's like a scan of pixels from left to right, row to row? |
| 2021-08-06 05:33:44 | <jle`> | the color values at each point? |
| 2021-08-06 05:34:10 | <Axman6> | it's a lot like a pgm image |
| 2021-08-06 05:34:24 | <Cajun> | well its essentially one row, but the challenge is to recover it, so i can say how far the rows go jle` |
| 2021-08-06 05:35:04 | <Axman6> | or ppm I guess |
| 2021-08-06 05:35:08 | <Axman6> | https://en.wikipedia.org/wiki/Netpbm#File_formats |
| 2021-08-06 05:36:18 | <jle`> | hm, i guess the challenge is if there's a streaming file format for images |
| 2021-08-06 05:36:21 | <jle`> | then you can just convert directly into there |
| 2021-08-06 05:36:54 | <Cajun> | i tried looking for a library that would directly take up a list of tuples and churn out an image, but its appearing to be more difficult |
| 2021-08-06 05:37:11 | <Axman6> | I feel like you're overthinking this jle`, it's just the Show output of a [(Word8,Word8,Word8)] that needs to be Read |
| 2021-08-06 05:37:16 | <Cajun> | i can try that strategy of splitting the file, but that just defers the work, no? |
| 2021-08-06 05:37:27 | <jle`> | Axman6: it can't be read because it doesn't fit into memory |
| 2021-08-06 05:37:32 | <jle`> | at least, that's how i'm interpreting it |
| 2021-08-06 05:37:43 | <jle`> | hm but 189mb should fit in memory |
| 2021-08-06 05:37:48 | <Axman6> | using lazyIO it should be fine |
| 2021-08-06 05:37:53 | <Cajun> | well for some reason its also taking up a bunch of memory and i have no idea why |
| 2021-08-06 05:37:54 | <Axman6> | yeah |
| 2021-08-06 05:38:08 | <jle`> | if it fits into memory then lazy or non-lazy io shouldn't be an issue either way i think |
| 2021-08-06 05:38:20 | <jle`> | the only reason you'd want lazy io is if you don't want your whole data in memory |
| 2021-08-06 05:38:26 | <Axman6> | well, you can do some calculations, but remember that (Word8,Word8,Word8) takes up much more space than 3*8 bits |
| 2021-08-06 05:38:40 | <Axman6> | there's like two words for the tuple, then two words per Word8 |
| 2021-08-06 05:38:43 | <Cajun> | reading the file isnt an issue, but `read` -ing it to convert the String to (Word8, Word8, Word8) eats all the ram and essentially crashes it |
| 2021-08-06 05:39:00 | <jle`> | ah yeah, 'read' is not really meant for actual work |
| 2021-08-06 05:39:05 | <jle`> | it's mostly for debugging |
| 2021-08-06 05:39:17 | <Axman6> | so, you might want to add: data Pixel = Pixel {-#UNPACK#-}Word8 {-#UNPACK#-}Word8 {-#UNPACK#-}Word8 and convert the tuple into that |
| 2021-08-06 05:39:38 | <Axman6> | missing the ! on those Word8's |
| 2021-08-06 05:39:58 | <Cajun> | convert the tuple during the `read` to that? |
| 2021-08-06 05:40:08 | <jle`> | so using like splitOn "," (like Axman suggested) should work, that's pretty efficient, especially if you do it over lazy io. otherwise you can use a parser library like attoparsec but that'd be overkill too |
| 2021-08-06 05:40:27 | <jle`> | read is mostly meant for debugging, so i wouldn't use it for anything serious |
| 2021-08-06 05:40:35 | <Axman6> | yeah - if you have (Word8, Word8, Word8) -> Pixel, then do map (toPixel . read) . ... |
| 2021-08-06 05:40:53 | <jle`> | > splitOn ", " "(1,2,3), (4,5,6), (7,8,9)" |
| 2021-08-06 05:40:54 | <lambdabot> | ["(1,2,3)","(4,5,6)","(7,8,9)"] |
| 2021-08-06 05:41:22 | <jle`> | or depending on your file format you can splitOn "," and chunksOf 3 it or something like that |
| 2021-08-06 05:41:23 | <Axman6> | % read " (1,2)" :: (Word8,Word8) |
| 2021-08-06 05:41:23 | <yahb> | Axman6: (1,2) |
| 2021-08-06 05:41:32 | <Axman6> | splitOn "," should be enough |
| 2021-08-06 05:41:48 | <Cajun> | and should that be in another `let` right before it? |
| 2021-08-06 05:42:11 | <Cajun> | like `let splitArr = splitOn "," rawList` |
| 2021-08-06 05:42:48 | <Axman6> | I would just do it in one go, let imageArr = map (toPixel . read) . splitOn "," . init . drop 1 $ rawList |
| 2021-08-06 05:43:08 | <Cajun> | why do you need the `init . drop 1` in this instance? |
| 2021-08-06 05:43:21 | <Axman6> | to get rid of the [ and ] |
| 2021-08-06 05:43:27 | <Cajun> | yeah just figured, makes sense |
| 2021-08-06 05:43:48 | <Axman6> | I would definitely not use readFile' either, whatever that is |
| 2021-08-06 05:43:56 | <Axman6> | lazy IO will help you here |
| 2021-08-06 05:44:14 | <Axman6> | one of the cases where it can actually work, without having to resort to streaming |
| 2021-08-06 05:44:21 | <Cajun> | so i do want to lazily read the file, not strictly with `readFile'` ? |
| 2021-08-06 05:44:27 | <Axman6> | yes |
| 2021-08-06 05:44:58 | <Axman6> | think about the overhead that 186 million Chars has |
| 2021-08-06 05:45:12 | <Axman6> | you will need gigabytes of memory to put that in RAM |
| 2021-08-06 05:45:19 | <jle`> | yeah, the advantage of splitOn, map, etc. is that they process the list char-by-char, they never need anything beyond that |
| 2021-08-06 05:45:44 | <jle`> | so with lazy io, io is driven by what splitOn, drop, init, map demand |
| 2021-08-06 05:45:47 | <Axman6> | each (:) is like three words, then each Char is another two words. and a word is 8 bytes |
| 2021-08-06 05:45:48 | <jle`> | and the demand is piece-by-piece |
| 2021-08-06 05:46:09 | <jle`> | so you never keep any char's in memory other than what toPixel/read/splitOn are directly processing |
| 2021-08-06 05:46:10 | <Cajun> | what library is `splitOn` from? it doesnt seem to be in the prelude |
| 2021-08-06 05:46:20 | <jle`> | should be in base in Data.List i think |
| 2021-08-06 05:46:36 | <jle`> | oh, it's not :o |
| 2021-08-06 05:46:53 | <Cajun> | hoogle says it exists for the `Text` datatype but not for strings |
| 2021-08-06 05:46:59 | <jle`> | looks like it's in the 'split' library |
| 2021-08-06 05:47:04 | <jle`> | which is pretty commonly used |
| 2021-08-06 05:47:57 | <Axman6> | we need to just merge all of splity into base already -_- |
| 2021-08-06 05:48:02 | <Axman6> | split* |
| 2021-08-06 05:48:04 | <jle`> | +1 |
| 2021-08-06 05:48:33 | <jle`> | but ì you're learning haskell, it's actually a neat exercise to write it from scratch too. on an unrelated note :) |
| 2021-08-06 05:48:43 | <Axman6> | agreed |
| 2021-08-06 05:49:13 | <jle`> | *if |
| 2021-08-06 05:49:21 | <Axman6> | splitOn :: [a] -> [a] -> [[a]] is significantly more difficult for a beginner than splitOn :: a -> [a] -> [[a]] too |
| 2021-08-06 05:49:34 | <Cajun> | and what is `toPixel` doing in that instance Axman6 ? |
| 2021-08-06 05:50:06 | <Axman6> | literally just putting the Word8's into the Pixel constructor, but it's a much more compact representation than the tuple |
| 2021-08-06 05:50:24 | <Axman6> | toPixel (r,g,b) = Pixel r g b |
| 2021-08-06 05:50:41 | <Axman6> | if you're using juicypixels, this type probably already exists |
| 2021-08-06 05:51:17 | <Axman6> | https://hackage.haskell.org/package/JuicyPixels-3.3.5/docs/Codec-Picture.html#t:PixelRGB8 |
| 2021-08-06 05:51:18 | <jle`> | i feel like you also need to account for the three-ness of the pixels |
| 2021-08-06 05:51:33 | <Axman6> | jle`: ? |
| 2021-08-06 05:51:40 | <Cajun> | i have a type synonym for something similar: `type RGB8 = (Pixel8, Pixel8, Pixel8)` |
| 2021-08-06 05:51:54 | <jle`> | > splitAt "," . drop 1 . init $ "[(233,173,20), (200, 10, 155)]" |
| 2021-08-06 05:51:56 | <lambdabot> | error: |
| 2021-08-06 05:51:56 | <lambdabot> | • Couldn't match expected type ‘Int’ with actual type ‘[Char]’ |
| 2021-08-06 05:51:56 | <lambdabot> | • In the first argument of ‘splitAt’, namely ‘","’ |
| 2021-08-06 05:52:07 | <jle`> | > splitOn "," . drop 1 . init $ "[(233,173,20), (200, 10, 155)]" |
| 2021-08-06 05:52:09 | <lambdabot> | ["(233","173","20)"," (200"," 10"," 155)"] |
| 2021-08-06 05:52:17 | <jle`> | you get each number instead of each triple |
| 2021-08-06 05:52:33 | <Axman6> | aren't you just handing this to juicypixels eventually? why not just go straight to its type? (Word8,Word8,Word8) is horrifically inefficient |
| 2021-08-06 05:53:01 | <Cajun> | well what isnt shown in that code segment is handing it off to Repa then to JuicyPixels |
| 2021-08-06 05:53:24 | <jle`> | if you're just getting the raw Word8's then no need to do anything fancy i think |
| 2021-08-06 05:53:30 | <Axman6> | 3*2 + 4 words, so 10 words, which is 80 bytes, per pixel. |
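The overhead figures quoted in the log can be checked with a little arithmetic. The following is a minimal sketch, assuming a 64-bit GHC where a machine word is 8 bytes and using the word counts given in the conversation (4 words for a boxed 3-tuple, 2 words per boxed Word8, 3 words per list cons cell, 2 words per boxed Char). The names `wordBytes`, `tupleBytes` and `stringBytes` are made up for illustration, and the `Pixel` type is the one Axman6 proposed with the strictness bangs added. The String figure is an upper-bound estimate, since the runtime may share some Char values.

```haskell
import Data.Word (Word8)

-- Axman6's suggested type, with the "!" that was noted as missing.
-- Strict, unpacked fields are stored inside the constructor itself
-- instead of behind pointers to separate heap objects.
data Pixel = Pixel {-# UNPACK #-} !Word8 {-# UNPACK #-} !Word8 {-# UNPACK #-} !Word8
  deriving (Eq, Show)

toPixel :: (Word8, Word8, Word8) -> Pixel
toPixel (r, g, b) = Pixel r g b

-- Assumption: 64-bit platform, so one machine word is 8 bytes.
wordBytes :: Int
wordBytes = 8

-- Boxed tuple: 4 words (header + 3 pointers), plus 2 words per boxed Word8.
tupleBytes :: Int
tupleBytes = (4 + 3 * 2) * wordBytes

-- String: each character costs a 3-word cons cell plus a 2-word boxed Char.
stringBytes :: Int -> Int
stringBytes n = n * (3 + 2) * wordBytes

main :: IO ()
main = do
  print tupleBytes                 -- 80 bytes per (Word8, Word8, Word8)
  print (stringBytes 186000000)    -- 7440000000 bytes for 186 million Chars
  print (toPixel (233, 173, 20))
```

By this estimate a fully materialised 186-million-character String needs on the order of 7 GB, which matches the "gigabytes of memory" remark and explains why consuming the input lazily matters here.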
All times are in UTC.
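
The parsing approach discussed in the log (split on commas instead of calling `read` on the whole list) can be sketched with `base` only. This is an illustrative sketch, not the code Cajun ended up with: `splitOnElem`, `chunksOf3` and `parsePixels` are hypothetical names, the single-element splitter stands in for `splitOn` from the `split` package, and the input is assumed to be well-formed `Show` output of a `[(Word8,Word8,Word8)]`. Because splitting on "," yields individual numbers, as jle` pointed out, the numbers are regrouped into triples afterwards.

```haskell
import Data.Char (isDigit)
import Data.Word (Word8)

-- Hypothetical stand-in for splitOn: splits on a single delimiter element.
-- It is lazy, so it only demands input as its result is consumed.
splitOnElem :: Eq a => a -> [a] -> [[a]]
splitOnElem d xs = case break (== d) xs of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnElem d rest

-- Regroup a flat list of channel values into (r, g, b) triples.
-- Any incomplete trailing group is dropped.
chunksOf3 :: [a] -> [(a, a, a)]
chunksOf3 (r : g : b : rest) = (r, g, b) : chunksOf3 rest
chunksOf3 _                  = []

-- Keep only digits and commas (dropping brackets, parentheses and
-- whitespace), split on commas, read each number, regroup into pixels.
parsePixels :: String -> [(Word8, Word8, Word8)]
parsePixels =
  chunksOf3 . map read . splitOnElem ',' . filter (\c -> isDigit c || c == ',')

main :: IO ()
main = print (parsePixels "[(233,173,20), (200, 10, 155)]")
-- prints [(233,173,20),(200,10,155)]
```

Every stage is an ordinary lazy list function, so feeding it the result of a lazy `readFile` lets the file be consumed piece by piece. Mapping the triples into a compact type such as JuicyPixels' `PixelRGB8` as they are produced would avoid holding boxed tuples for the whole image.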