Logs: freenode/#haskell
| 2021-04-12 00:18:53 | → | ulfryk joins (~ulfryk@2a01:4b00:872d:e600:a55a:b8e3:54cc:d8d6) |
| 2021-04-12 00:21:33 | × | myShoggoth quits (~myShoggot@75.164.73.93) (Ping timeout: 240 seconds) |
| 2021-04-12 00:22:59 | → | wroathe joins (~wroathe@c-68-54-25-135.hsd1.mn.comcast.net) |
| 2021-04-12 00:23:12 | × | Sgeo quits (~Sgeo@ool-18b98aa4.dyn.optonline.net) (Ping timeout: 240 seconds) |
| 2021-04-12 00:23:27 | × | ulfryk quits (~ulfryk@2a01:4b00:872d:e600:a55a:b8e3:54cc:d8d6) (Ping timeout: 260 seconds) |
| 2021-04-12 00:23:36 | → | Sgeo joins (~Sgeo@ool-18b98aa4.dyn.optonline.net) |
| 2021-04-12 00:24:12 | × | star_cloud quits (~star_clou@ec2-34-220-44-120.us-west-2.compute.amazonaws.com) (Ping timeout: 240 seconds) |
| 2021-04-12 00:25:01 | → | zyeri joins (zyeri@gateway/shell/tilde.team/x-worsvflxuunnsvnw) |
| 2021-04-12 00:25:01 | × | zyeri quits (zyeri@gateway/shell/tilde.team/x-worsvflxuunnsvnw) (Changing host) |
| 2021-04-12 00:25:01 | → | zyeri joins (zyeri@tilde.team/users/zyeri) |
| 2021-04-12 00:26:21 | → | justanotheruser joins (~justanoth@unaffiliated/justanotheruser) |
| 2021-04-12 00:26:38 | × | quinn quits (~quinn@c-73-223-224-163.hsd1.ca.comcast.net) (Ping timeout: 240 seconds) |
| 2021-04-12 00:27:28 | × | wroathe quits (~wroathe@c-68-54-25-135.hsd1.mn.comcast.net) (Ping timeout: 252 seconds) |
| 2021-04-12 00:28:44 | <koz_> | d34df00d: What's your question(s)? |
| 2021-04-12 00:28:45 | × | Tario quits (~Tario@200.119.187.163) (Read error: Connection reset by peer) |
| 2021-04-12 00:31:36 | → | quinn joins (~quinn@c-73-223-224-163.hsd1.ca.comcast.net) |
| 2021-04-12 00:34:18 | → | Tario joins (~Tario@201.192.165.173) |
| 2021-04-12 00:38:36 | × | tmciver quits (~tmciver@cpe-172-101-40-226.maine.res.rr.com) (Ping timeout: 260 seconds) |
| 2021-04-12 00:40:20 | → | tmciver joins (~tmciver@cpe-172-101-40-226.maine.res.rr.com) |
| 2021-04-12 00:40:36 | → | cloudpip joins (sid67735@gateway/web/irccloud.com/x-lqqwgjfhbduhzygo) |
| 2021-04-12 00:41:09 | × | acidjnk_new quits (~acidjnk@p200300d0c72b950365222184c91f1222.dip0.t-ipconnect.de) (Ping timeout: 250 seconds) |
| 2021-04-12 00:41:56 | <cloudpip> | hi all, I'm trying to build a recompile-and-run loop with ghc: it compiles your code and runs the main function in a loop, so that, for example, in an interactive program you can close the program and it'll recompile the sources that changed and restart main |
| 2021-04-12 00:42:07 | × | abhixec quits (~abhixec@c-67-169-139-16.hsd1.ca.comcast.net) (Remote host closed the connection) |
| 2021-04-12 00:42:17 | <cloudpip> | https://github.com/homectl/workspace/blob/main/livecoding/src/Debug/LiveCoding.hs <- it works, with 2 caveats I'd like to resolve |
| 2021-04-12 00:44:00 | <cloudpip> | 1) most importantly, I want it to compile to object code, like ghci -fobject-code. this is clearly possible (since ghci -fobject-code works), but when I set the target to HscAsm, it no longer reloads the modules even though it does recompile them |
| 2021-04-12 00:44:14 | <cloudpip> | compiling with HscAsm does work, but reloading the modules does not |
| 2021-04-12 00:45:55 | <cloudpip> | 2) adding -hide-all-packages (via Opt_HideAllPackages) makes it crash: https://www.irccloud.com/pastebin/SL8YmJSc/ |
| 2021-04-12 00:46:36 | × | vicfred quits (~vicfred@unaffiliated/vicfred) (Quit: Leaving) |
| 2021-04-12 00:49:38 | → | jamm_ joins (~jamm@unaffiliated/jamm) |
| 2021-04-12 00:53:52 | × | jamm_ quits (~jamm@unaffiliated/jamm) (Ping timeout: 258 seconds) |
| 2021-04-12 00:56:02 | × | ViCi quits (daniel@10PLM.ro) (Quit: Quit!) |
| 2021-04-12 00:56:40 | → | abhixec joins (~abhixec@c-67-169-139-16.hsd1.ca.comcast.net) |
| 2021-04-12 00:57:40 | <wrunt> | cloudpip: maybe you can find a clue in the implementation of Dyre, since it does run-time compilation? (https://github.com/willdonnelly/dyre) |
| 2021-04-12 00:58:17 | <cloudpip> | I'm staring at GHCi.UI and I don't see what I'm doing differently |
| 2021-04-12 00:58:24 | <cloudpip> | it looks exactly the same to me |
| 2021-04-12 00:59:23 | <cloudpip> | https://github.com/willdonnelly/dyre/blob/master/Config/Dyre/Compile.hs#L73 |
| 2021-04-12 00:59:27 | <cloudpip> | dyre seems to just call ghc? |
| 2021-04-12 01:00:10 | <cloudpip> | I specifically don't want to call ghc, because I don't want to wait 1 minute for ld to do the executable linking |
| 2021-04-12 01:00:22 | <cloudpip> | ghci in-memory linking is super fast, so I want that |
| 2021-04-12 01:01:01 | <cloudpip> | I looked at hint, it only does HscInterpreted |
| 2021-04-12 01:03:45 | <cloudpip> | now I'm looking at "plugins", which does in fact do loading of .o files, but it's more low level than what I need.. still, it might be useful (though it doesn't work on windows) |
| 2021-04-12 01:05:44 | → | ulfryk joins (~ulfryk@2a01:4b00:872d:e600:a55a:b8e3:54cc:d8d6) |
| 2021-04-12 01:05:47 | → | vicfred joins (~vicfred@unaffiliated/vicfred) |
| 2021-04-12 01:05:58 | → | GZJ0X_ joins (~gzj@unaffiliated/gzj) |
| 2021-04-12 01:07:18 | × | ulfryk quits (~ulfryk@2a01:4b00:872d:e600:a55a:b8e3:54cc:d8d6) (Remote host closed the connection) |
| 2021-04-12 01:08:29 | × | Tuplanolla quits (~Tuplanoll@91-159-68-239.elisa-laajakaista.fi) (Quit: Leaving.) |
| 2021-04-12 01:09:49 | × | gzj quits (~gzj@unaffiliated/gzj) (Ping timeout: 252 seconds) |
| 2021-04-12 01:10:31 | → | DTZUZU_ joins (~DTZUZO@207.81.119.43) |
| 2021-04-12 01:11:04 | × | whataday quits (~xxx@2400:8902::f03c:92ff:fe60:98d8) (Remote host closed the connection) |
| 2021-04-12 01:11:44 | <d34df00d> | koz_: well, I have this code for doing IDCT (the bottom-most function, idctBlocks, is doing that, plus collecting all of the results to ensure things are fully evaluated, but that's perhaps irrelevant): |
| 2021-04-12 01:11:50 | <d34df00d> | https://bpaste.net/PXTA |
| 2021-04-12 01:11:56 | <d34df00d> | It's also built with -fllvm -O2 |
| 2021-04-12 01:12:11 | → | whataday joins (~xxx@2400:8902::f03c:92ff:fe60:98d8) |
| 2021-04-12 01:12:12 | × | DTZUZU quits (~DTZUZO@205.ip-149-56-132.net) (Ping timeout: 240 seconds) |
| 2021-04-12 01:12:57 | <d34df00d> | And it is ridiculously slow. It takes about 2 seconds of CPU time on some test data I have (which has about 1 million 8×8 matrices over which IDCT happens, so about 64 million elements in the vector that the function takes). |
| 2021-04-12 01:13:04 | → | ulfryk joins (~ulfryk@2a01:4b00:872d:e600:a55a:b8e3:54cc:d8d6) |
| 2021-04-12 01:13:29 | <d34df00d> | My rough estimate of the time required for this is from 250 milliseconds (for dumb, scalar code) to about 30 ms if SIMD is involved. |
| 2021-04-12 01:14:03 | <d34df00d> | So I wonder what I'm doing wrong and how I can make this faster. |
| 2021-04-12 01:15:05 | <d34df00d> | Ah, and the performance of the code is insensitive to whether I'm going row-wise or column-wise — replacing arrSlice = R.unsafeSlice arr (sh :. x :. All) with arrSlice = R.unsafeSlice arr (sh :. All :. x) there (and similarly for idctSlice) has no effect on performance whatsoever. |
| 2021-04-12 01:15:14 | <d34df00d> | Which definitely should not happen for well-optimized code. |
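For reference, the "dumb, scalar code" baseline mentioned above can be sketched in plain Haskell. This is not the pasted repa code, only a minimal illustration under assumptions: it uses the orthonormal 8-point DCT-III convention (as in JPEG), stores a block as a list of rows, and the names idct8, idct8x8 and dcOnly are made up for this sketch. It applies a 1D transform to every row, then to every column.

```haskell
module Main where

import Data.List (transpose)

-- | Orthonormal 8-point inverse DCT (DCT-III) of a single row or column.
--   Assumes the input list holds exactly 8 coefficients.
idct8 :: [Double] -> [Double]
idct8 coeffs = [ 0.5 * sample n | n <- [0 .. 7 :: Int] ]
  where
    -- Sum the contribution of every frequency k to output sample n.
    sample n = sum (zipWith (term n) [0 :: Int ..] coeffs)
    term n k c = scale k * c * cos (pi * fromIntegral ((2 * n + 1) * k) / 16)
    -- The DC term carries an extra 1/sqrt 2 in the orthonormal convention.
    scale :: Int -> Double
    scale 0 = 1 / sqrt 2
    scale _ = 1

-- | Separable 2D IDCT of an 8x8 block: rows first, then columns.
idct8x8 :: [[Double]] -> [[Double]]
idct8x8 = transpose . map idct8 . transpose . map idct8

-- | Hypothetical test block: only the DC coefficient is set.
dcOnly :: [[Double]]
dcOnly = (8 : replicate 7 0) : replicate 7 (replicate 8 0)

main :: IO ()
main =
  -- With this normalisation, a DC coefficient of 8 should decode to a
  -- block that is constant 1 (up to floating-point rounding).
  print (all (\v -> abs (v - 1) < 1e-9) (concat (idct8x8 dcOnly)))
```

Lists are far too slow for a million blocks; the sketch only pins down what is being computed, so that results from a repa (or other array library) version can be checked against it.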
| 2021-04-12 01:17:38 | × | ulfryk quits (~ulfryk@2a01:4b00:872d:e600:a55a:b8e3:54cc:d8d6) (Ping timeout: 258 seconds) |
| 2021-04-12 01:17:58 | × | quinn quits (~quinn@c-73-223-224-163.hsd1.ca.comcast.net) (Ping timeout: 240 seconds) |
| 2021-04-12 01:18:38 | → | ulfryk joins (~ulfryk@2a01:4b00:872d:e600:a55a:b8e3:54cc:d8d6) |
| 2021-04-12 01:19:31 | → | quinn joins (~quinn@c-73-223-224-163.hsd1.ca.comcast.net) |
| 2021-04-12 01:19:49 | → | wroathe joins (~wroathe@c-68-54-25-135.hsd1.mn.comcast.net) |
| 2021-04-12 01:21:14 | → | DTZUZU joins (~DTZUZO@205.ip-149-56-132.net) |
| 2021-04-12 01:21:22 | × | quinn quits (~quinn@c-73-223-224-163.hsd1.ca.comcast.net) (Client Quit) |
| 2021-04-12 01:22:43 | × | xff0x quits (~xff0x@2001:1a81:5278:bf00:33a0:2c0f:72ed:caee) (Ping timeout: 260 seconds) |
| 2021-04-12 01:23:11 | × | ulfryk quits (~ulfryk@2a01:4b00:872d:e600:a55a:b8e3:54cc:d8d6) (Ping timeout: 260 seconds) |
| 2021-04-12 01:23:18 | <koz_> | What difference(s) do you observe without -fllvm? |
| 2021-04-12 01:23:26 | × | DTZUZU_ quits (~DTZUZO@207.81.119.43) (Ping timeout: 240 seconds) |
| 2021-04-12 01:24:14 | → | ulfryk joins (~ulfryk@2a01:4b00:872d:e600:a55a:b8e3:54cc:d8d6) |
| 2021-04-12 01:24:18 | → | xff0x joins (~xff0x@2001:1a81:52af:1400:3ce5:1261:85cb:8b42) |
| 2021-04-12 01:27:14 | <d34df00d> | Oh, much slower. |
| 2021-04-12 01:27:17 | <d34df00d> | Still waiting… |
| 2021-04-12 01:27:33 | <koz_> | OK, and I guess the same _with_ -fllvm but with -O1? |
| 2021-04-12 01:27:40 | <koz_> | I'm trying to rule out weird regressions. |
| 2021-04-12 01:27:50 | <d34df00d> | Alrighty, done waiting. 45 seconds with -fasm for that module vs -fllvm. |
| 2021-04-12 01:28:09 | <d34df00d> | vs 2 for -fllvm, that is. |
| 2021-04-12 01:28:12 | <d34df00d> | Let me try -O1 now. |
| 2021-04-12 01:28:26 | <d34df00d> | Yeah, -O1 with -fllvm is slower, but not that much — 3.3 seconds vs 2 seconds. |
| 2021-04-12 01:28:36 | <koz_> | OK so not a weird regression. |
| 2021-04-12 01:28:40 | koz_ | thinks a bit. |
| 2021-04-12 01:28:49 | × | ulfryk quits (~ulfryk@2a01:4b00:872d:e600:a55a:b8e3:54cc:d8d6) (Ping timeout: 250 seconds) |
| 2021-04-12 01:29:11 | → | quinn joins (~quinn@c-73-223-224-163.hsd1.ca.comcast.net) |
| 2021-04-12 01:29:19 | × | quinn quits (~quinn@c-73-223-224-163.hsd1.ca.comcast.net) (Client Quit) |
| 2021-04-12 01:29:32 | <d34df00d> | I mean, as a next step I could either try accelerate instead of repa, or try to write that stuff myself with primops (too bad ghc primops don't have horizontal add, meh!), or dunno. |
| 2021-04-12 01:29:35 | <koz_> | Yeah, might be better to see if anyone knows, cause I'm a bit mystified. |
| 2021-04-12 01:29:47 | <d34df00d> | But I have a gut feeling that repa can do better than that, and I'd like to know how. |
| 2021-04-12 01:30:15 | <d34df00d> | I mean, it cannot be one or two orders of magnitude slower than something that's somewhat easily achievable. |
| 2021-04-12 01:30:35 | <koz_> | Repa is meant to emit CPU code right? |
| 2021-04-12 01:30:38 | <koz_> | Have you tried massiv? |
| 2021-04-12 01:30:39 | <d34df00d> | Yep. |
| 2021-04-12 01:30:41 | <d34df00d> | Nope. |
| 2021-04-12 01:30:45 | <koz_> | I'd be curious if massiv could do better. |
| 2021-04-12 01:31:16 | <koz_> | For Cabal files, if I wanna detect being on a Mac, I test 'os(macos)', right? |
| 2021-04-12 01:31:56 | <d34df00d> | Yeah, I'll definitely give it a shot! I never really used massiv, so I'll be curious to see how it performs. |
| 2021-04-12 01:32:08 | → | merijn joins (~merijn@83-160-49-249.ip.xs4all.nl) |
| 2021-04-12 01:32:13 | <d34df00d> | In another task where I used repa, it was very close to what I would expect performance-wise. |
| 2021-04-12 01:32:51 | <koz_> | Yeah, but there is definitely some Repa-specific specialist knowledge required to make sense of what will run well or not. |
All times are in UTC.