Logs: freenode/#haskell
| 2021-04-06 15:22:43 | × | steerio quits (~steerio@aviv.kinneret.de) (Quit: leaving) |
| 2021-04-06 15:22:55 | × | whataday quits (~xxx@2400:8902::f03c:92ff:fe60:98d8) (Remote host closed the connection) |
| 2021-04-06 15:24:02 | → | whataday joins (~xxx@2400:8902::f03c:92ff:fe60:98d8) |
| 2021-04-06 15:24:55 | × | whataday quits (~xxx@2400:8902::f03c:92ff:fe60:98d8) (Remote host closed the connection) |
| 2021-04-06 15:26:02 | → | whataday joins (~xxx@2400:8902::f03c:92ff:fe60:98d8) |
| 2021-04-06 15:31:15 | × | merijn quits (~merijn@83-160-49-249.ip.xs4all.nl) (Ping timeout: 260 seconds) |
| 2021-04-06 15:31:44 | × | LKoen quits (~LKoen@65.250.88.92.rev.sfr.net) (Quit: “It’s only logical. First you learn to talk, then you learn to think. Too bad it’s not the other way round.”) |
| 2021-04-06 15:33:24 | → | ulidtko joins (~ulidtko@194.54.80.38) |
| 2021-04-06 15:39:10 | → | heatsink joins (~heatsink@108-201-191-115.lightspeed.sntcca.sbcglobal.net) |
| 2021-04-06 15:41:11 | × | dcoutts quits (~duncan@94.186.125.91.dyn.plus.net) (Ping timeout: 240 seconds) |
| 2021-04-06 15:41:50 | × | mettekou quits (~mettekou@d8D875214.access.telenet.be) (Quit: Leaving) |
| 2021-04-06 15:44:11 | → | z0 joins (~zzz@2001:8a0:de1b:9601:85ea:cf54:e2e4:7d25) |
| 2021-04-06 15:44:28 | → | royal_screwup21 joins (52254809@gateway/web/cgi-irc/kiwiirc.com/ip.82.37.72.9) |
| 2021-04-06 15:44:40 | → | hiptobecubic joins (~john@unaffiliated/hiptobecubic) |
| 2021-04-06 15:48:04 | → | barthandelous joins (~calebbrze@2600:1007:b0a1:3aa7:b579:e4ea:b055:38a9) |
| 2021-04-06 15:48:12 | → | LKoen joins (~LKoen@65.250.88.92.rev.sfr.net) |
| 2021-04-06 15:49:18 | → | esp32_prog joins (~esp32_pro@91.193.4.202) |
| 2021-04-06 15:49:27 | × | royal_screwup21 quits (52254809@gateway/web/cgi-irc/kiwiirc.com/ip.82.37.72.9) (Ping timeout: 260 seconds) |
| 2021-04-06 15:50:01 | × | alx741 quits (~alx741@181.196.68.238) (Ping timeout: 260 seconds) |
| 2021-04-06 15:52:12 | × | barthandelous quits (~calebbrze@2600:1007:b0a1:3aa7:b579:e4ea:b055:38a9) (Ping timeout: 246 seconds) |
| 2021-04-06 15:53:10 | → | Wuzzy joins (~Wuzzy@p5790e46d.dip0.t-ipconnect.de) |
| 2021-04-06 15:55:44 | → | ukari joins (~ukari@unaffiliated/ukari) |
| 2021-04-06 15:56:30 | → | hypercube joins (hypercube@gateway/vpn/protonvpn/hypercube) |
| 2021-04-06 15:57:46 | → | gtk joins (~user@roc37-h01-176-170-197-243.dsl.sta.abo.bbox.fr) |
| 2021-04-06 15:58:00 | → | zyklotomic joins (~ethan@unaffiliated/chocopuff) |
| 2021-04-06 15:58:20 | × | z0 quits (~zzz@2001:8a0:de1b:9601:85ea:cf54:e2e4:7d25) (Quit: z0) |
| 2021-04-06 15:58:54 | <dmj`> | maerwald: nix engineer is a thing now? |
| 2021-04-06 15:59:08 | <maerwald> | dmj`: sure |
| 2021-04-06 16:00:01 | <monochrom> | I would s/engineer/admin/ but it is a millennial trend to call every position "engineer". |
| 2021-04-06 16:00:31 | × | stree quits (~stree@68.36.8.116) (Ping timeout: 260 seconds) |
| 2021-04-06 16:00:36 | <aldum> | imo better than the gen X trend of calling every position a manager |
| 2021-04-06 16:00:46 | × | esp32_prog quits (~esp32_pro@91.193.4.202) (Ping timeout: 240 seconds) |
| 2021-04-06 16:01:02 | <aldum> | or at least not worse |
| 2021-04-06 16:01:42 | <dmj`> | bash engineer |
| 2021-04-06 16:02:11 | × | gtk quits (~user@roc37-h01-176-170-197-243.dsl.sta.abo.bbox.fr) (Ping timeout: 240 seconds) |
| 2021-04-06 16:02:17 | <monochrom> | If you answer emails for an elected politician you can be called "legislator's social media engineer" |
| 2021-04-06 16:03:13 | <monochrom> | If you were a spin doctor for said politician, you are now a spin engineer. |
| 2021-04-06 16:03:15 | → | tzh joins (~tzh@c-24-21-73-154.hsd1.wa.comcast.net) |
| 2021-04-06 16:03:31 | <aldum> | spin epidemiologist these days :V |
| 2021-04-06 16:03:41 | <maerwald> | is this supposed to "leak" 1GB? https://paste.tomsmeding.com/WbAcQ3rq |
| 2021-04-06 16:03:51 | → | ToastInTheMachin joins (~ToastInTh@172.93.177.196) |
| 2021-04-06 16:04:05 | × | molehillish quits (~molehilli@2600:8800:8d06:1800:b54a:36bf:7632:87f4) (Ping timeout: 252 seconds) |
| 2021-04-06 16:04:49 | → | merijn joins (~merijn@83-160-49-249.ip.xs4all.nl) |
| 2021-04-06 16:05:42 | → | tomboy64 joins (~tomboy64@unaffiliated/tomboy64) |
| 2021-04-06 16:06:39 | × | ToastInTheMachin quits (~ToastInTh@172.93.177.196) (Read error: Connection reset by peer) |
| 2021-04-06 16:07:19 | <dmj`> | maerwald: yes |
| 2021-04-06 16:07:32 | <dmj`> | maerwald: you're not incrementally processing the list |
| 2021-04-06 16:07:46 | <dmj`> | maerwald: the sink is your writeFile, but you're keeping the list in memory with null check |
| 2021-04-06 16:08:01 | <maerwald> | dmj`: but there's no chunk overlap |
| 2021-04-06 16:08:40 | <dmj`> | maerwald: it's still a list, it's same problem with avg xs = sum xs / fromIntegral (length xs) |
| 2021-04-06 16:08:56 | × | whataday quits (~xxx@2400:8902::f03c:92ff:fe60:98d8) (Remote host closed the connection) |
| 2021-04-06 16:09:00 | <dmj`> | the second you have to take two passes on a list, you can't incrementally process it |
| 2021-04-06 16:09:22 | → | Rudd0 joins (~Rudd0@185.189.115.108) |
| 2021-04-06 16:09:42 | → | bahamas joins (~lucian@unaffiliated/bahamas) |
| 2021-04-06 16:09:59 | × | merijn quits (~merijn@83-160-49-249.ip.xs4all.nl) (Ping timeout: 260 seconds) |
| 2021-04-06 16:10:02 | → | whataday joins (~xxx@2400:8902::f03c:92ff:fe60:98d8) |
| 2021-04-06 16:10:35 | <maerwald> | is there a way to make this work with bytestring only? |
| 2021-04-06 16:11:00 | <tomsmeding> | maerwald: let's apply fusion terminology to this; to remove the sharing in that code, you need vertical fusion to merge the 'take' and 'writeFile' loops over the buffer into, say, loop 1, and again vertical fusion to merge the 'drop' and 'null' loops into loop 2; then you can use horizontal fusion to merge loop 1 and loop 2 into a single loop. At this point there is only one consumer, which can then merge using vertical fusion with 'readFile' to produce constant memory usage |
| 2021-04-06 16:11:17 | <tomsmeding> | I suspect that the horizontal fusion is the thing that's not being performed because there's explicit monadic sequencing between the writeFile and the null |
| 2021-04-06 16:11:17 | <shapr> | I want to be a nix engineer! |
| 2021-04-06 16:11:33 | → | fendor_ joins (~fendor@91.141.3.10.wireless.dyn.drei.com) |
| 2021-04-06 16:12:13 | <tomsmeding> | intuition for vertical vs horizontal fusion: make a data flow graph of your code where nodes are operations (take/drop/writeFile etc), and arrows generally point downwards |
| 2021-04-06 16:12:51 | <tomsmeding> | vertical fusion is merging two directly vertically connected nodes into one; horizontal mutatis mutandis |
| 2021-04-06 16:13:26 | → | stree joins (~stree@68.36.8.116) |
| 2021-04-06 16:13:46 | × | fendor quits (~fendor@91.141.0.13.wireless.dyn.drei.com) (Ping timeout: 240 seconds) |
| 2021-04-06 16:14:01 | → | gitgood joins (~gitgood@80-44-12-39.dynamic.dsl.as9105.com) |
| 2021-04-06 16:15:49 | <tomsmeding> | in fact, in dmj`'s avg example, it's again horizontal fusion of the 'sum' and 'length' loops that could make it constant-memory; not sure if ghc does that, but there are other compilers for other languages that do |
| 2021-04-06 16:16:21 | × | jpds quits (~jpds@gateway/tor-sasl/jpds) (Ping timeout: 240 seconds) |
| 2021-04-06 16:17:17 | <dmj`> | streaming is about keeping a predictable number of bytes in memory at any given time. Laziness does this for you, for free, but it only works if nobody else is sharing your thunk; the second someone else shares it, you just accumulate in memory instead of incrementally processing |
| 2021-04-06 16:17:51 | <dmj`> | sum and length both share `xs`, sum can't process it incrementally, because length is demanding it |
| 2021-04-06 16:18:08 | <dmj`> | it's the same with writeFile and null in maerwald's example |
| 2021-04-06 16:18:13 | <maerwald> | is there an unsafe way to work around it? |
| 2021-04-06 16:18:47 | <tomsmeding> | dmj`: indeed, unless the compiler can see the definitions of 'sum' and 'length' and fuse their loops together into one pass over the list; again I don't know if ghc does this, but it's theoretically possible |
| 2021-04-06 16:19:01 | <dmj`> | tomsmeding: I haven't been able to get ghc to do horizontal fusion of sum and length, code had to become uncurry (/) $ foldl' (\(x,y) n -> (x + n, y + 1)) (0,0) xs |
| 2021-04-06 16:19:13 | <tomsmeding> | maerwald: wild guess: does it help if you use BSL.splitAt instead of take/drop? Small chance |
| 2021-04-06 16:19:24 | → | Sornaensis joins (~Sornaensi@077213203030.dynamic.telenor.dk) |
| 2021-04-06 16:19:42 | <tomsmeding> | dmj`: yeah there you manually performed horizontal fusion :) Too bad about ghc |
| 2021-04-06 16:19:48 | <dmj`> | tomsmeding: GHC's optimizer won't do that I believe, and vector fusion relies on rewrite rules that don't always fire |
| 2021-04-06 16:19:50 | <tomsmeding> | it's very difficult to do in general though |
| 2021-04-06 16:19:52 | × | notzmv quits (~zmv@unaffiliated/zmv) (Ping timeout: 265 seconds) |
| 2021-04-06 16:19:55 | <maerwald> | tomsmeding: I thought that too, but no, splitAt can't do it |
| 2021-04-06 16:20:01 | <tomsmeding> | sad |
| 2021-04-06 16:20:38 | <dmj`> | tomsmeding: GHC only optimizes across a single module at a time as well (blind JMP problem), those functions are defined in a different module, GHC would need to have the entire program in memory to do that. |
| 2021-04-06 16:20:47 | → | jpds joins (~jpds@gateway/tor-sasl/jpds) |
| 2021-04-06 16:21:03 | <dmj`> | maerwald: delete the "rest" and the "null" |
| 2021-04-06 16:21:10 | × | cr3 quits (~cr3@192-222-143-195.qc.cable.ebox.net) (Quit: leaving) |
| 2021-04-06 16:21:31 | <maerwald> | dmj`: this is a minimal example... can't be done in the real code |
| 2021-04-06 16:21:54 | <maerwald> | there, rest is part of an unfoldr |
| 2021-04-06 16:22:15 | <maerwald> | and at every step it will leak `content` |
| 2021-04-06 16:22:21 | <tomsmeding> | dmj`: surely sum and length are inlined in ghc :p |
| 2021-04-06 16:22:53 | <maerwald> | I think it's time to get a drink and be sad |
| 2021-04-06 16:23:00 | <tomsmeding> | dmj`: 1. that's what {-# INLINE abc #-} is for, and 2. ghc does that for small functions anyway |
| 2021-04-06 16:23:31 | × | Sorny quits (~Sornaensi@79.142.232.102.static.router4.bolignet.dk) (Ping timeout: 260 seconds) |
| 2021-04-06 16:23:44 | <tomsmeding> | maerwald: if bytestring had offered a read function that only reads a prefix of a file (i.e. 1GB), then you could split the readFile in two; that would work |
| 2021-04-06 16:23:51 | <tomsmeding> | but that probably doesn't work in the full tar case either |
| 2021-04-06 16:24:44 | <maerwald> | I'll just rewrite that logic with streamly |
| 2021-04-06 16:24:58 | → | Sorna joins (~Sornaensi@79.142.232.102.static.router4.bolignet.dk) |
All times are in UTC.
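
The avg example from the discussion can be sketched as follows, as an illustration rather than anything posted in the channel. avgTwoPass is the two-traversal definition dmj` quotes; avgOnePass is a hand-fused single pass in the spirit of his foldl' rewrite. The names avgTwoPass, avgOnePass and Acc are hypothetical. The strict-field accumulator is a swap-in for the lazy tuple in the original one-liner, chosen because foldl' only forces the accumulator to weak head normal form, so a lazy pair can still build up thunks in its components.

```haskell
import Data.List (foldl')

-- Two-pass version quoted in the log: 'sum' and 'length' both consume xs,
-- so the list must stay reachable until the second traversal has finished.
avgTwoPass :: [Double] -> Double
avgTwoPass xs = sum xs / fromIntegral (length xs)

-- Hypothetical accumulator with strict fields: running total and count.
-- Strict fields are forced whenever foldl' forces the accumulator itself.
data Acc = Acc !Double !Int

-- Single-pass version: one traversal updates both the total and the count,
-- so each list cell can be collected as soon as it has been consumed
-- (provided nothing else holds a reference to xs).
avgOnePass :: [Double] -> Double
avgOnePass xs = total / fromIntegral count
  where
    Acc total count = foldl' step (Acc 0 0) xs
    step (Acc s n) x = Acc (s + x) (n + 1)
```

Both functions return the same value for a finite list; the difference is only in how much of the list has to be held in memory while the result is computed.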