Logs: freenode/#haskell
| 2021-05-04 13:18:29 | <edmundnoble> | Well I'm not entirely sure, but it seems to me that Haskell threads are switched between extremely often |
| 2021-05-04 13:18:35 | <edmundnoble> | So say I have `n` tasks of roughly the same shape |
| 2021-05-04 13:18:40 | <edmundnoble> | Regardless of how cheap Haskell threads are |
| 2021-05-04 13:18:45 | → | valstan joins (56788fbb@86.120.143.187) |
| 2021-05-04 13:18:51 | <edmundnoble> | If I run `n` of them concurrently, I can expect a cost of `n` times as much memory usage |
| 2021-05-04 13:18:57 | <merijn> | edmundnoble: And what makes you think the same won't happen for sparks? |
| 2021-05-04 13:19:05 | <edmundnoble> | Because sparks are run on idle processors |
| 2021-05-04 13:19:09 | <hyperisco> | should be as simple as getting a [(Text,Text)] and a [Text] |
| 2021-05-04 13:19:11 | <tdammers> | Haskell threads are "green threads", they are distributed over a number of OS threads ("capabilities") as the RTS sees fit, though you have some influence on the heuristics |
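The green-threads point tdammers makes can be sketched in a few lines: `forkIO` threads are cheap RTS-level threads multiplexed over the capabilities, so spawning thousands is fine. A minimal sketch (the `spawnAndWait` helper is hypothetical, just for illustration):

```haskell
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)

-- Spawn k green threads and wait for all of them.
-- Each thread costs only a small heap-allocated stack; the RTS
-- schedules them across the capabilities set by +RTS -N.
spawnAndWait :: Int -> IO ()
spawnAndWait k = do
  done <- newEmptyMVar
  mapM_ (\_ -> forkIO (putMVar done ())) [1 .. k]
  mapM_ (\_ -> takeMVar done) [1 .. k]
```

Memory use still grows with the number of live threads, which is the cost edmundnoble is pointing at above.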
| 2021-05-04 13:19:13 | × | __minoru__shirae quits (~shiraeesh@46.34.207.226) (Ping timeout: 260 seconds) |
| 2021-05-04 13:19:25 | <hyperisco> | which I can just do myself really |
| 2021-05-04 13:19:26 | <merijn> | edmundnoble: "idle processor" just means "capability not currently executing a Haskell thread" |
| 2021-05-04 13:19:40 | <edmundnoble> | Yes |
| 2021-05-04 13:19:43 | <edmundnoble> | That makes sense to me |
| 2021-05-04 13:19:55 | <edmundnoble> | The difference is |
| 2021-05-04 13:20:05 | <merijn> | edmundnoble: Nothing about sparks guarantees they evaluate to completion before being pre-empted, though |
| 2021-05-04 13:20:06 | <edmundnoble> | Once you've got each capability running a spark |
| 2021-05-04 13:20:09 | → | jespada joins (~jespada@87.74.37.248) |
| 2021-05-04 13:20:22 | <edmundnoble> | Uh |
| 2021-05-04 13:20:29 | <merijn> | at least, nothing I'm aware of |
| 2021-05-04 13:20:31 | <edmundnoble> | Is that true? I have read that sparks are only run on idle processors... |
| 2021-05-04 13:20:38 | × | Rudd0 quits (~Rudd0@185.189.115.108) (Ping timeout: 260 seconds) |
| 2021-05-04 13:20:39 | <merijn> | edmundnoble: So? |
| 2021-05-04 13:20:43 | <mniip> | so do threads |
| 2021-05-04 13:20:47 | <merijn> | edmundnoble: How does that guarantee they won't be interrupted? |
| 2021-05-04 13:20:56 | <edmundnoble> | They won't be interrupted by other sparks |
| 2021-05-04 13:21:00 | <edmundnoble> | That's the only thing I'm asking for |
| 2021-05-04 13:21:07 | <merijn> | edmundnoble: Won't they be? |
| 2021-05-04 13:21:18 | <edmundnoble> | Again, sparks are only run on idle processors |
| 2021-05-04 13:21:24 | <merijn> | edmundnoble: So? |
| 2021-05-04 13:21:28 | <edmundnoble> | If a processor running a spark has been interrupted, it was given work |
| 2021-05-04 13:21:31 | <edmundnoble> | So it's not idle |
| 2021-05-04 13:21:35 | <merijn> | edmundnoble: That just means "we don't unschedule Haskell threads" |
| 2021-05-04 13:21:49 | <merijn> | edmundnoble: It doesn't say "and we will process a spark until completion without preemption" |
| 2021-05-04 13:22:04 | <edmundnoble> | If a spark has already been sparked, it's been converted to a Haskell thread |
| 2021-05-04 13:22:10 | <edmundnoble> | This conversion as far as I understand is one-way |
| 2021-05-04 13:22:19 | <edmundnoble> | If you have enough Haskell threads to saturate the available capabilities |
| 2021-05-04 13:22:32 | <merijn> | sparks are never haskell threads, sparks, as far as I recall, are a completely separate abstraction in the RTS |
| 2021-05-04 13:22:33 | <edmundnoble> | And the threads don't go idle |
| 2021-05-04 13:22:59 | <mniip> | I think you're asking for thread priorities? |
| 2021-05-04 13:23:13 | <edmundnoble> | Sparks are converted into threads |
| 2021-05-04 13:23:15 | <edmundnoble> | That's what "sparking" is |
| 2021-05-04 13:23:23 | <merijn> | mniip: It's unclear, because there's too much unstated context |
| 2021-05-04 13:23:40 | <int-e> | edmundnoble: No. There are worker threads that grab items from a queue of sparks and execute them. |
| 2021-05-04 13:23:52 | <merijn> | edmundnoble: Where does it say that? |
| 2021-05-04 13:23:56 | <int-e> | So it's the same thread for many sparks. |
| 2021-05-04 13:23:58 | <merijn> | edmundnoble: I've never seen that in any paper |
| 2021-05-04 13:24:05 | <edmundnoble> | The user's guide for GHC |
| 2021-05-04 13:24:22 | × | ValeraRozuvan quits (~ValeraRoz@95.164.65.159) (Quit: ValeraRozuvan) |
| 2021-05-04 13:24:29 | <edmundnoble> | So wait, to be clear |
| 2021-05-04 13:24:39 | <int-e> | (and these threads are scheduled on capabilities that would otherwise be idle) |
| 2021-05-04 13:24:57 | <edmundnoble> | Right okay, so sparks are run on worker threads which are scheduled on capabilities that are otherwise idle |
| 2021-05-04 13:25:01 | <merijn> | edmundnoble: I don't see any reference to that in the user guide? Where did you see that? |
| 2021-05-04 13:25:23 | <merijn> | (the sparking converting to threads) |
| 2021-05-04 13:26:48 | <merijn> | I mean, sparks are just a work queue with dynamically spawned worker threads evaluating them. If you're in IO you might as well fork a number of worker threads and use a queue, and get literally the same behaviour without needing to do a whole investigation of whether it's safe to do IO |
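The worker-pool-plus-queue shape merijn describes can be sketched with a `Chan` of `IO ()` jobs. This is one common way to write it, not the only one (the `runPool` name is made up for this sketch):

```haskell
import Control.Concurrent (forkIO, newChan, readChan, writeChan)
import Control.Monad (forever, replicateM_)

-- Run a list of IO jobs on a fixed pool of n worker threads,
-- all draining a single shared queue; blocks until every job
-- has finished. Concurrency is capped at n regardless of how
-- many jobs are enqueued.
runPool :: Int -> [IO ()] -> IO ()
runPool n jobs = do
  q    <- newChan
  done <- newChan
  replicateM_ n $ forkIO $ forever $ do
    job <- readChan q   -- block until work is available
    job
    writeChan done ()   -- signal one completed job
  mapM_ (writeChan q) jobs
  replicateM_ (length jobs) (readChan done)
```

Because the jobs are explicit `IO` actions on ordinary Haskell threads, none of the purity caveats around sparks apply.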
| 2021-05-04 13:26:53 | <edmundnoble> | int-e: I'm wondering if I can sanely use these worker threads and the queue for computations in IO |
| 2021-05-04 13:27:06 | <merijn> | edmundnoble: Not sanely, no |
| 2021-05-04 13:27:33 | <merijn> | "crazily" might be possible, given sufficient effort, but why bother |
| 2021-05-04 13:27:57 | <int-e> | hmm, I have seen the "conversion" terminology somewhere though |
| 2021-05-04 13:27:59 | <edmundnoble> | Essentially my question is whether I can get a program-wide way to limit concurrency which instead of interfering with the spark queue, cooperates with it |
| 2021-05-04 13:28:05 | <int-e> | (a spark is either converted or fizzles) |
| 2021-05-04 13:28:12 | <yushyin> | 'If the runtime detects that there is an idle CPU, then it may convert a spark into a real thread' https://downloads.haskell.org/ghc/latest/docs/html/users_guide/exts/concurrent.html?#annotating-pure-code-for-parallelism |
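The annotation the user's guide is describing is `par` from `Control.Parallel` (in the `parallel` package): `x `par` y` creates a spark for `x`, which the RTS may pick up on an idle capability, or which fizzles if `x` gets demanded first. A minimal sketch, with a deliberately naive `fib` as the work:

```haskell
import Control.Parallel (par, pseq)

-- Spark the evaluation of x while this thread computes y,
-- then combine. `pseq` forces y before the sum so the spark
-- has a chance to be converted rather than fizzling.
parSum :: Int -> Int -> Int
parSum a b = x `par` (y `pseq` x + y)
  where
    x = fib a
    y = fib b

fib :: Int -> Int
fib n | n < 2     = n
      | otherwise = fib (n - 1) + fib (n - 2)
```

The result is the same whether the spark is converted, stolen, or fizzles, which is exactly why the mechanism is only meant for pure code.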
| 2021-05-04 13:28:25 | <int-e> | it's just the notion that each spark becomes its own thread that does not correspond to reality |
| 2021-05-04 13:28:32 | <merijn> | edmundnoble: I mean, you can just spawn a limited number of workers using a single queue and then you're done? |
| 2021-05-04 13:28:35 | <yushyin> | don't know what a 'real thread' is in this context |
| 2021-05-04 13:28:44 | <edmundnoble> | A Haskell thread, I think, yushyin |
| 2021-05-04 13:29:10 | <edmundnoble> | Right, that would be the idea merijn, but I see no reason to have two schedulers in my program when one can do just as well |
| 2021-05-04 13:29:23 | <merijn> | Where does the 2nd scheduler come in? |
| 2021-05-04 13:29:26 | <int-e> | yushyin: I mean a separate forkIO/createThread() setup. |
| 2021-05-04 13:29:29 | <merijn> | Why do you need a 2nd scheduler? |
| 2021-05-04 13:29:34 | → | __minoru__shirae joins (~shiraeesh@46.34.207.226) |
| 2021-05-04 13:29:39 | <edmundnoble> | Oh you know what scheduler is the wrong term |
| 2021-05-04 13:29:51 | <merijn> | Just spawn threads and put out work as needed? |
| 2021-05-04 13:29:52 | <edmundnoble> | Two work queues distributing work to Haskell threads |
| 2021-05-04 13:30:05 | → | todda7 joins (~torstein@2a02:587:3724:1a75:aca:df22:9d82:969f) |
| 2021-05-04 13:30:07 | <merijn> | Why would there be two queues? |
| 2021-05-04 13:30:11 | × | valstan quits (56788fbb@86.120.143.187) (Quit: Connection closed) |
| 2021-05-04 13:30:22 | <edmundnoble> | There's one queue for sparks, and one queue for my workers |
| 2021-05-04 13:30:58 | <merijn> | the spark queue only exists if you have sparks, though |
| 2021-05-04 13:31:48 | <edmundnoble> | Okay, so why should there be two queues and two sets of worker threads if I have sparks and I have my own tasks? |
| 2021-05-04 13:32:13 | <int-e> | yushyin: of course that's an implementation detail; pure code is not allowed to care about threads |
| 2021-05-04 13:32:15 | <merijn> | because IO is not a spark task |
| 2021-05-04 13:32:36 | <edmundnoble> | And... why not |
| 2021-05-04 13:32:44 | <edmundnoble> | What kind of craziness could I expect? |
| 2021-05-04 13:32:45 | <merijn> | Because spark tasks *must* be pure |
| 2021-05-04 13:32:59 | <edmundnoble> | Really? |
| 2021-05-04 13:33:01 | <merijn> | Because they may or may not be evaluated in parallel (or at all!) |
| 2021-05-04 13:33:18 | <merijn> | If you have side-effects then it matters whether you get evaluated and in which order |
| 2021-05-04 13:33:28 | <int-e> | all this talk about sparks not being interrupted really just means that unsafePerformIO in sparks is about as safe as anywhere else outside of STM. |
| 2021-05-04 13:33:30 | <edmundnoble> | Okay, and what if I have side effects which don't need to be evaluated at all, or in parallel |
| 2021-05-04 13:33:32 | <merijn> | The entire reason sparks exists is to do evaluation of pure code in parallel |
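The "may or may not be evaluated (or at all!)" point is exactly what breaks side effects in sparks. A deliberately bad sketch of what smuggling IO into a spark via `unsafePerformIO` looks like: the returned value is deterministic, but the effect may run once, run on some other capability, or never run if the spark fizzles or is garbage-collected.

```haskell
import Control.Parallel (par)
import System.IO.Unsafe (unsafePerformIO)

-- DANGER sketch: the sparked thunk carries a side effect.
-- Whether "maybe printed?" appears depends entirely on RTS
-- scheduling; the pure result (42) is unaffected either way.
bad :: Int
bad = effectful `par` 42
  where
    effectful :: Int
    effectful = unsafePerformIO (putStrLn "maybe printed?" >> pure 0)
```

This is the sense in which, as int-e says below, `unsafePerformIO` in sparks is about as (un)safe as anywhere else: the value semantics hold, but nothing about when or whether the effect happens is specified.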
| 2021-05-04 13:33:46 | <merijn> | int-e: That was my point, yes |
| 2021-05-04 13:34:03 | <int-e> | merijn: sorry, I didn't read much of context |
| 2021-05-04 13:34:12 | <merijn> | In essence this entire line of questions boils down to "is unsafePerformIO bad?" |
| 2021-05-04 13:34:15 | <edmundnoble> | Nah |
| 2021-05-04 13:34:19 | → | ddellaco_ joins (~ddellacos@ool-44c73afa.dyn.optonline.net) |
| 2021-05-04 13:34:23 | <merijn> | To which the answer is: Yes |
| 2021-05-04 13:34:34 | <edmundnoble> | It actually boils down to "what semantic insanity can I expect *specifically* from using unsafePerformIO in sparks" |
| 2021-05-04 13:34:41 | <edmundnoble> | This was, in fact, my initial question |
| 2021-05-04 13:34:52 | → | ddellac__ joins (ddellacost@gateway/vpn/mullvad/ddellacosta) |
All times are in UTC.