
Logs: liberachat/#haskell

2021-08-06 05:30:33 <jle`> Cajun: what are the Word8s supposed to represent?
2021-08-06 05:30:40 <Cajun> pixels
2021-08-06 05:30:54 <jle`> like color channels?
2021-08-06 05:31:15 <jle`> and this is like a row major representation?
2021-08-06 05:31:20 <Cajun> in the juicypixels library, it uses `Pixel8` which is a synonym for `Word8` so i might as well convert to Word8
2021-08-06 05:31:39 <jle`> i'm just trying to understand how the image is encoded heh
2021-08-06 05:31:56 <jle`> when first looking at the list it seemed like a list of points to connect like an svg maybe heh
2021-08-06 05:32:43 <Cajun> its literally just a list like this: "[(Num, Num, Num), (Num, Num, Num)......]" where each Num is an RGB value and each tuple makes a pixel
2021-08-06 05:32:44 merijn joins (~merijn@83-160-49-249.ip.xs4all.nl)
2021-08-06 05:32:59 <Axman6> I would be inclined to split the reading into smaller chunks - assuming you don't need to be super strict with the format, you could do something like: map read . splitOn "," . drop 1 . init
2021-08-06 05:33:03 oxide joins (~lambda@user/oxide)
2021-08-06 05:33:35 <jle`> Cajun: so, it's like a scan of pixels from left to right, row to row?
2021-08-06 05:33:44 <jle`> the color values at each point?
2021-08-06 05:34:10 <Axman6> it's a lot like a pgm image
2021-08-06 05:34:24 <Cajun> well its essentially one row, but the challenge is to recover it, so i can say how far the rows go jle`
2021-08-06 05:35:04 <Axman6> or ppm I guess
2021-08-06 05:35:08 <Axman6> https://en.wikipedia.org/wiki/Netpbm#File_formats
2021-08-06 05:36:18 <jle`> hm, i guess the challenge is if there's a streaming file format for images
2021-08-06 05:36:21 <jle`> then you can just convert directly into there
2021-08-06 05:36:54 <Cajun> i tried looking for a library that would directly take up a list of tuples and churn out an image, but its appearing to be more difficult
2021-08-06 05:37:11 <Axman6> I feel like you're overthinking this jle`, it's just the Show output of a [(Word8,Word8,Word8)] that needs to be Read
2021-08-06 05:37:16 <Cajun> i can try that strategy of splitting the file, but that just defers the work, no?
2021-08-06 05:37:27 <jle`> Axman6: it can't be read because it doesn't fit into memory
2021-08-06 05:37:32 <jle`> at least, that's how i'm interpreting it
2021-08-06 05:37:43 <jle`> hm but 189mb should fit in memory
2021-08-06 05:37:48 <Axman6> using lazyIO it should be fine
2021-08-06 05:37:53 <Cajun> well for some reason its also taking up a bunch of memory and i have no idea why
2021-08-06 05:37:54 <Axman6> yeah
2021-08-06 05:38:08 <jle`> if it fits into memory then lazy or non-lazy io shouldn't be an issue either way i think
2021-08-06 05:38:20 <jle`> the only reason you'd want lazy io is if you don't want your whole data in memory
2021-08-06 05:38:26 <Axman6> well, you can do some calculations, but remember that (Word8,Word8,Word8) takes up much more space than 3*8 bytes
2021-08-06 05:38:40 <Axman6> there's like two words for the tuple, then two words per Word8
2021-08-06 05:38:43 <Cajun> reading the file isnt an issue, but `read` -ing it to convert the String to (Word8, Word8, Word8) eats all the ram and essentially crashes it
2021-08-06 05:39:00 <jle`> ah yeah, 'read' is not really meant for actual work
2021-08-06 05:39:05 <jle`> it's mostly for debugging
2021-08-06 05:39:17 <Axman6> so, you might want to add: data Pixel = Pixel {-#UNPACK#-}Word8 {-#UNPACK#-}Word8 {-#UNPACK#-}Word8 and convert the tuple into that
2021-08-06 05:39:38 <Axman6> missing the ! on those Word8's
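
A minimal sketch of the type Axman6 describes, with the `!` strictness annotations from his correction applied. The `deriving` clause is an addition for illustration only, and `toPixel` follows the `(Word8, Word8, Word8) -> Pixel` signature mentioned in the chat. GHC should generally only honour `UNPACK` when compiling with optimisation (`-O`).

```haskell
import Data.Word (Word8)

-- Strict, unpacked fields: the three channels are stored inside the
-- constructor itself rather than behind three separate pointers.
data Pixel = Pixel
  {-# UNPACK #-} !Word8
  {-# UNPACK #-} !Word8
  {-# UNPACK #-} !Word8
  deriving (Eq, Show) -- added here for illustration, not from the chat

-- Move the channels of a lazy tuple into the compact constructor.
toPixel :: (Word8, Word8, Word8) -> Pixel
toPixel (r, g, b) = Pixel r g b
```
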
2021-08-06 05:39:58 <Cajun> convert the tuple during the `read` to that?
2021-08-06 05:40:08 <jle`> so using like splitOn "," (like Axman suggested) should work, that's pretty efficient, especially if you do it over lazy io. otherwise you can use a parser library like attoparsec but that'd be overkill too
2021-08-06 05:40:27 <jle`> read is mostly meant for debugging, so i wouldn't use it for anything serious
2021-08-06 05:40:35 <Axman6> yeah - if you have (Word8, Word8, Word8) -> Pixel, then do map (toPixel . read) . ...
2021-08-06 05:40:53 <jle`> > splitOn ", " "(1,2,3), (4,5,6), (7,8,9)"
2021-08-06 05:40:54 <lambdabot> ["(1,2,3)","(4,5,6)","(7,8,9)"]
2021-08-06 05:41:22 <jle`> or depending on your file format you can splitOn "," and chunksOf 3 it or something like that
2021-08-06 05:41:23 <Axman6> % read " (1,2)" :: (Word8,Word8)
2021-08-06 05:41:23 <yahb> Axman6: (1,2)
2021-08-06 05:41:32 <Axman6> splitOn "," should be enough
2021-08-06 05:41:48 <Cajun> and should that be in another `let` right before it?
2021-08-06 05:42:11 <Cajun> like `let splitArr = splitOn "," rawList`
2021-08-06 05:42:48 <Axman6> I would just do it in one go, let imageArr = map (toPixel . read) . splitOn "," . init . drop 1 $ rawList
2021-08-06 05:42:56 × falafel quits (~falafel@pool-96-255-70-50.washdc.fios.verizon.net) (Ping timeout: 272 seconds)
2021-08-06 05:43:08 <Cajun> why do you need the `init . drop 1` in this instance?
2021-08-06 05:43:21 <Axman6> to get rid of the [ and ]
2021-08-06 05:43:27 <Cajun> yeah just figured, makes sense
2021-08-06 05:43:48 <Axman6> I would definitely not use readFile' either, whatever that is
2021-08-06 05:43:54 × koz quits (~koz@121.99.240.58) (Remote host closed the connection)
2021-08-06 05:43:56 <Axman6> lazy IO will help you here
2021-08-06 05:44:09 koz joins (~koz@121.99.240.58)
2021-08-06 05:44:14 <Axman6> one of the cases where it can actually work, without having to resort to streaming
2021-08-06 05:44:21 <Cajun> so i do want to lazily read the file, not strictly with `readFile'` ?
2021-08-06 05:44:27 <Axman6> yes
2021-08-06 05:44:58 <Axman6> think about the overhead that 186 million Chars has
2021-08-06 05:45:12 <Axman6> you will need gigabytes of memory to put that in RAM
2021-08-06 05:45:19 <jle`> yeah, the advantage of splitOn, map, etc. is that they process the list char-by-char, they never need anything beyond that
2021-08-06 05:45:44 <jle`> so with lazy io, io is driven by what splitOn, drop, init, map demand
2021-08-06 05:45:47 <Axman6> each (:) is like three words, then each Char is another two words. and a word is 8 bytes
2021-08-06 05:45:48 <jle`> and the demand is piece-by-piece
2021-08-06 05:46:09 <jle`> so you never keep any char's in memory other than what toPixel/read/splitOn are directly processing
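
A small sketch of the demand-driven behaviour jle` describes, assuming the Prelude's lazy `readFile`. The helper names `countPixels` and `countPixelsInFile` are hypothetical, and counting opening parentheses is a stand-in for real parsing. Because the strict left fold consumes the String one character at a time and keeps no reference to the part already seen, the file contents should not all need to be resident at once.

```haskell
import Data.List (foldl')

-- Count pixels by counting '(' characters (hypothetical stand-in for
-- real parsing). foldl' forces the counter at each step, so no chain
-- of thunks builds up while the input is consumed.
countPixels :: String -> Int
countPixels = foldl' step 0
  where
    step n c = if c == '(' then n + 1 else n

-- Prelude's readFile is lazy: the file is read as the fold demands
-- more characters, not all up front.
countPixelsInFile :: FilePath -> IO Int
countPixelsInFile path = countPixels <$> readFile path
```
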
2021-08-06 05:46:10 <Cajun> what library is `splitOn` from? it doesnt seem to be in the prelude
2021-08-06 05:46:20 <jle`> should be in base in Data.List i think
2021-08-06 05:46:36 <jle`> oh, it's not :o
2021-08-06 05:46:53 <Cajun> hoogle says it exists for the `Text` datatype but not for strings
2021-08-06 05:46:59 <jle`> looks like it's in the 'split' library
2021-08-06 05:47:04 <jle`> which is pretty commonly used
2021-08-06 05:47:57 <Axman6> we need to just merge all of splity into base already -_-
2021-08-06 05:48:02 <Axman6> split*
2021-08-06 05:48:04 <jle`> +1
2021-08-06 05:48:33 <jle`> but ì you're learning haskell, it's actually a neat exercise to write it from scratch too. on an unrelated note :)
2021-08-06 05:48:43 <Axman6> agreed
2021-08-06 05:49:13 <jle`> *if
2021-08-06 05:49:21 <Axman6> splitOn :: [a] -> [a] -> [[a]] is significantly more difficult for a beginner than splitOn :: a -> [a] -> [[a]] too
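
For the exercise mentioned above, one possible from-scratch version of the simpler, single-element variant. The name `splitOnElem` is hypothetical, chosen to avoid clashing with `splitOn` from the `split` package; this is a sketch, not that library's implementation.

```haskell
-- Split a list at every occurrence of a separator element.
-- `break` stops at the first separator; we emit the chunk before it
-- and recurse on whatever follows the separator.
splitOnElem :: Eq a => a -> [a] -> [[a]]
splitOnElem sep xs = case break (== sep) xs of
  (chunk, [])       -> [chunk]                       -- no separator left
  (chunk, _ : rest) -> chunk : splitOnElem sep rest  -- drop the separator
```

Each chunk is produced as soon as its closing separator is reached, so this also works piece by piece on a lazily read String.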
2021-08-06 05:49:34 <Cajun> and what is `toPixel` doing in that instance Axman6 ?
2021-08-06 05:50:06 <Axman6> literally just putting the Word8's into the Pixel constructor, but it's a much more compact representation than the tuple
2021-08-06 05:50:24 <Axman6> toPixel (r,g,b) = Pixel r g b
2021-08-06 05:50:41 <Axman6> if you're using juicypixels, this type probably already exists
2021-08-06 05:51:17 <Axman6> https://hackage.haskell.org/package/JuicyPixels-3.3.5/docs/Codec-Picture.html#t:PixelRGB8
2021-08-06 05:51:18 <jle`> i feel like you also need to account for the three-ness of the pixels
2021-08-06 05:51:33 <Axman6> jle`: ?
2021-08-06 05:51:40 <Cajun> i have a type synonym for something similar: `type RGB8 = (Pixel8, Pixel8, Pixel8)`
2021-08-06 05:51:54 <jle`> > splitAt "," . drop 1 . init $ "[(233,173,20), (200, 10, 155)]"
2021-08-06 05:51:56 <lambdabot> error:
2021-08-06 05:51:56 <lambdabot> • Couldn't match expected type ‘Int’ with actual type ‘[Char]’
2021-08-06 05:51:56 <lambdabot> • In the first argument of ‘splitAt’, namely ‘","’
2021-08-06 05:52:07 <jle`> > splitOn "," . drop 1 . init $ "[(233,173,20), (200, 10, 155)]"
2021-08-06 05:52:09 <lambdabot> ["(233","173","20)"," (200"," 10"," 155)"]
2021-08-06 05:52:17 <jle`> you get each number instead of each triple
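
One way to account for the "three-ness" without any extra package, sketched under assumptions: instead of `splitOn`, every non-digit character is blanked out and the remaining numbers are regrouped in threes. This is a swapped-in technique, not what the chat proposes, and the names `channels`, `triples` and `parsePixels` are hypothetical. Note that `read` at type `Word8` wraps out-of-range values silently, so this assumes well-formed input in the range 0-255.

```haskell
import Data.Char (isDigit)
import Data.Word (Word8)

-- Replace brackets, parentheses and commas with spaces, then read
-- each remaining run of digits as one colour channel.
channels :: String -> [Word8]
channels = map read . words . map keepDigit
  where
    keepDigit c = if isDigit c then c else ' '

-- Regroup a flat list into triples; a trailing incomplete group is dropped.
triples :: [a] -> [(a, a, a)]
triples (r : g : b : rest) = (r, g, b) : triples rest
triples _ = []

parsePixels :: String -> [(Word8, Word8, Word8)]
parsePixels = triples . channels
```
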
2021-08-06 05:52:33 <Axman6> aren't you just handing this to juicypixels eventually? why not just go straight to its type? (Word8,Word8,Word8) is horrifically inefficient
2021-08-06 05:53:01 <Cajun> well what isnt shown in that code segment is handing it off to Repa then to JuicyPixels
2021-08-06 05:53:24 <jle`> if you're just getting the raw Word8's then no need to do anything fancy i think
2021-08-06 05:53:30 <Axman6> 3*2 + 4 words, so 10 words, which is 80 bytes, per pixel.
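
The back-of-the-envelope figures from the discussion, written out as a worked calculation. The word counts are the estimates quoted in the chat for a 64-bit GHC (a 3-tuple as 4 words, a boxed Word8 or Char as 2 words, a `(:)` cell as 3 words), not measurements; all names are hypothetical.

```haskell
-- Size of one machine word in bytes, assuming a 64-bit platform.
wordBytes :: Int
wordBytes = 8

-- One (Word8, Word8, Word8): 4 words for the tuple plus 2 words for
-- each of the three boxed Word8 values.
bytesPerTuplePixel :: Int
bytesPerTuplePixel = (4 + 3 * 2) * wordBytes

-- One character of a String: a 3-word (:) cell plus a 2-word boxed Char.
bytesPerStringChar :: Int
bytesPerStringChar = (3 + 2) * wordBytes

-- Estimated heap usage of a fully materialised String of n characters.
stringBytes :: Int -> Int
stringBytes n = n * bytesPerStringChar
```

With these estimates, `bytesPerTuplePixel` is 80 and `stringBytes 186000000` is 7440000000, i.e. roughly 7.4 GB for the whole file held as a String.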

All times are in UTC.