
Logs: freenode/#haskell

2021-05-04 13:18:29 <edmundnoble> Well I'm not entirely sure, but it seems to me that Haskell threads are switched between extremely often
2021-05-04 13:18:35 <edmundnoble> So say I have `n` tasks of roughly the same shape
2021-05-04 13:18:40 <edmundnoble> Regardless of how cheap Haskell threads are
2021-05-04 13:18:45 valstan joins (56788fbb@86.120.143.187)
2021-05-04 13:18:51 <edmundnoble> If I run `n` of them concurrently, I can expect a cost of `n` times as much memory usage
2021-05-04 13:18:57 <merijn> edmundnoble: And what makes you think the same won't happen for sparks?
2021-05-04 13:19:05 <edmundnoble> Because sparks are run on idle processors
2021-05-04 13:19:09 <hyperisco> should be as simple as getting a [(Text,Text)] and a [Text]
2021-05-04 13:19:11 <tdammers> Haskell threads are "green threads", they are distributed over a number of OS threads ("capabilities") as the RTS sees fit, though you have some influence on the heuristics
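tdammers's point about green threads can be sketched with a tiny program: `forkIO` creates lightweight RTS-scheduled threads, and the thread count below is an arbitrary illustration of how cheap they are, not anything from the discussion itself.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM)

main :: IO ()
main = do
  box <- newEmptyMVar
  -- 10,000 green threads; the RTS multiplexes them over its capabilities.
  forM_ [1 .. 10000 :: Int] $ \i -> forkIO (putMVar box i)
  -- Collect one value per thread; the sum is independent of scheduling order.
  xs <- replicateM 10000 (takeMVar box)
  print (sum xs)
```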
2021-05-04 13:19:13 × __minoru__shirae quits (~shiraeesh@46.34.207.226) (Ping timeout: 260 seconds)
2021-05-04 13:19:25 <hyperisco> which I can just do myself really
2021-05-04 13:19:26 <merijn> edmundnoble: "idle processor" just means "capability not currently executing a Haskell thread"
2021-05-04 13:19:40 <edmundnoble> Yes
2021-05-04 13:19:43 <edmundnoble> That makes sense to me
2021-05-04 13:19:55 <edmundnoble> The difference is
2021-05-04 13:20:05 <merijn> edmundnoble: Nothing about sparks guarantees they evaluate to completion before being pre-empted, though
2021-05-04 13:20:06 <edmundnoble> Once you've got each capability running a spark
2021-05-04 13:20:09 jespada joins (~jespada@87.74.37.248)
2021-05-04 13:20:22 <edmundnoble> Uh
2021-05-04 13:20:29 <merijn> at least, nothing I'm aware of
2021-05-04 13:20:31 <edmundnoble> Is that true? I have read that sparks are only run on idle processors...
2021-05-04 13:20:38 × Rudd0 quits (~Rudd0@185.189.115.108) (Ping timeout: 260 seconds)
2021-05-04 13:20:39 <merijn> edmundnoble: So?
2021-05-04 13:20:43 <mniip> so do threads
2021-05-04 13:20:47 <merijn> edmundnoble: How does that guarantee they won't be interrupted?
2021-05-04 13:20:56 <edmundnoble> They won't be interrupted by other sparks
2021-05-04 13:21:00 <edmundnoble> That's the only thing I'm asking for
2021-05-04 13:21:07 <merijn> edmundnoble: Won't they be?
2021-05-04 13:21:18 <edmundnoble> Again, sparks are only run on idle processors
2021-05-04 13:21:24 <merijn> edmundnoble: So?
2021-05-04 13:21:28 <edmundnoble> If a processor running a spark has been interrupted, it was given work
2021-05-04 13:21:31 <edmundnoble> So it's not idle
2021-05-04 13:21:35 <merijn> edmundnoble: That just means "we don't unschedule Haskell threads"
2021-05-04 13:21:49 <merijn> edmundnoble: It doesn't say "and we will process a spark until completion without preemption"
2021-05-04 13:22:04 <edmundnoble> If a spark has already been sparked, it's been converted to a Haskell thread
2021-05-04 13:22:10 <edmundnoble> This conversion as far as I understand is one-way
2021-05-04 13:22:19 <edmundnoble> If you have enough Haskell threads to saturate the available capabilities
2021-05-04 13:22:32 <merijn> sparks are never Haskell threads; sparks, as far as I recall, are a completely separate abstraction in the RTS
2021-05-04 13:22:33 <edmundnoble> And the threads don't go idle
2021-05-04 13:22:59 <mniip> I think you're asking for thread priorities?
2021-05-04 13:23:13 <edmundnoble> Sparks are converted into threads
2021-05-04 13:23:15 <edmundnoble> That's what "sparking" is
2021-05-04 13:23:23 <merijn> mniip: It's unclear, because there's too much unstated context
2021-05-04 13:23:40 <int-e> edmundnoble: No. There are worker threads that grab items from a queue of sparks and execute them.
2021-05-04 13:23:52 <merijn> edmundnoble: Where does it say that?
2021-05-04 13:23:56 <int-e> So it's the same thread for many sparks.
2021-05-04 13:23:58 <merijn> edmundnoble: I've never seen that in any paper
2021-05-04 13:24:05 <edmundnoble> The user's guide for GHC
2021-05-04 13:24:22 × ValeraRozuvan quits (~ValeraRoz@95.164.65.159) (Quit: ValeraRozuvan)
2021-05-04 13:24:29 <edmundnoble> So wait, to be clear
2021-05-04 13:24:39 <int-e> (and these threads are scheduled on capabilities that would otherwise be idle)
2021-05-04 13:24:57 <edmundnoble> Right okay, so sparks are run on worker threads which are scheduled on capabilities that are otherwise idle
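The spark mechanism int-e describes is exposed to programs through `par` (available from `GHC.Conc` in base, and re-exported by the `parallel` package). A minimal sketch, assuming GHC's RTS: the spark for `ls` may be converted and run by an otherwise-idle capability, or it may fizzle and be evaluated by the calling thread, with the same result either way.

```haskell
import GHC.Conc (par, pseq)

-- `ls `par` ...` creates a spark for `ls`; an idle capability may
-- convert and evaluate it, or it may fizzle and be forced here.
-- `pseq` makes the caller evaluate `rs` first, so it doesn't
-- immediately demand `ls` and defeat the spark.
parSum :: [Int] -> Int
parSum xs = ls `par` (rs `pseq` (ls + rs))
  where
    (l, r) = splitAt (length xs `div` 2) xs
    ls = sum l
    rs = sum r

main :: IO ()
main = print (parSum [1 .. 1000])
```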
2021-05-04 13:25:01 <merijn> edmundnoble: I don't see any reference to that in the user guide? Where did you see that?
2021-05-04 13:25:23 <merijn> (the sparking converting to threads)
2021-05-04 13:26:48 <merijn> I mean, sparks are just a work queue with dynamically spawned worker threads evaluating them; if you're in IO you might as well fork a number of worker threads and use a queue, and get literally the same behaviour without needing to do a whole investigation into whether it's safe to do IO
2021-05-04 13:26:53 <edmundnoble> int-e: I'm wondering if I can sanely use these worker threads and the queue for computations in IO
2021-05-04 13:27:06 <merijn> edmundnoble: Not sanely, no
2021-05-04 13:27:33 <merijn> "crazily" might be possible, given sufficient effort, but why bother
2021-05-04 13:27:57 <int-e> hmm, I have seen the "conversion" terminology somewhere though
2021-05-04 13:27:59 <edmundnoble> Essentially my question is whether I can get a program-wide way to limit concurrency which instead of interfering with the spark queue, cooperates with it
2021-05-04 13:28:05 <int-e> (a spark is either converted or fizzles)
2021-05-04 13:28:12 <yushyin> 'If the runtime detects that there is an idle CPU, then it may convert a spark into a real thread' https://downloads.haskell.org/ghc/latest/docs/html/users_guide/exts/concurrent.html?#annotating-pure-code-for-parallelism
2021-05-04 13:28:25 <int-e> it's just the notion that each spark becomes its own thread that does not correspond to reality
2021-05-04 13:28:32 <merijn> edmundnoble: I mean, you can just spawn a limited number of workers using a single queue and then you're done?
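merijn's suggestion of a fixed number of IO workers draining one queue might look like the sketch below; the `workerPool` name and the poison-pill shutdown are illustrative choices, not any library's API.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Concurrent.MVar
  (modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Monad (replicateM, replicateM_)

-- Run the given IO jobs with at most `n` running concurrently:
-- n worker threads drain a single shared queue.
workerPool :: Int -> [IO ()] -> IO ()
workerPool n jobs = do
  queue <- newChan
  mapM_ (writeChan queue . Just) jobs
  replicateM_ n (writeChan queue Nothing)  -- one poison pill per worker
  dones <- replicateM n $ do
    done <- newEmptyMVar
    _ <- forkIO (worker queue >> putMVar done ())
    pure done
  mapM_ takeMVar dones                     -- wait for every worker to finish
  where
    worker q = do
      mjob <- readChan q
      case mjob of
        Nothing  -> pure ()
        Just job -> job >> worker q

main :: IO ()
main = do
  counter <- newMVar (0 :: Int)
  workerPool 4 [modifyMVar_ counter (pure . (+ i)) | i <- [1 .. 100]]
  readMVar counter >>= print
```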
2021-05-04 13:28:35 <yushyin> don't know what a 'real thread' is in this context
2021-05-04 13:28:44 <edmundnoble> A Haskell thread, I think, yushyin
2021-05-04 13:29:10 <edmundnoble> Right, that would be the idea merijn, but I see no reason to have two schedulers in my program when one can do just as well
2021-05-04 13:29:23 <merijn> Where does the 2nd scheduler come in?
2021-05-04 13:29:26 <int-e> yushyin: I mean a separate forkIO/createThread() setup.
2021-05-04 13:29:29 <merijn> Why do you need a 2nd scheduler?
2021-05-04 13:29:34 __minoru__shirae joins (~shiraeesh@46.34.207.226)
2021-05-04 13:29:39 <edmundnoble> Oh you know what scheduler is the wrong term
2021-05-04 13:29:51 <merijn> Just spawn threads and put out work as needed?
2021-05-04 13:29:52 <edmundnoble> Two work queues distributing work to Haskell threads
2021-05-04 13:30:05 todda7 joins (~torstein@2a02:587:3724:1a75:aca:df22:9d82:969f)
2021-05-04 13:30:07 <merijn> Why would there be two queues?
2021-05-04 13:30:11 × valstan quits (56788fbb@86.120.143.187) (Quit: Connection closed)
2021-05-04 13:30:22 <edmundnoble> There's one queue for sparks, and one queue for my workers
2021-05-04 13:30:58 <merijn> the spark queue only exists if you have sparks, though
2021-05-04 13:31:48 <edmundnoble> Okay, so why should there be two queues and two sets of worker threads if I have sparks and I have my own tasks?
2021-05-04 13:32:13 <int-e> yushyin: of course that's an implementation detail; pure code is not allowed to care about threads
2021-05-04 13:32:15 <merijn> because IO is not a spark task
2021-05-04 13:32:36 <edmundnoble> And... why not
2021-05-04 13:32:44 <edmundnoble> What kind of craziness could I expect?
2021-05-04 13:32:45 <merijn> Because spark tasks *must* be pure
2021-05-04 13:32:59 <edmundnoble> Really?
2021-05-04 13:33:01 <merijn> Because they may or may not be evaluated in parallel (or at all!)
2021-05-04 13:33:18 <merijn> If you have side-effects then it matters whether you get evaluated and in which order
2021-05-04 13:33:28 <int-e> all this talk about sparks not being interrupted really just means that unsafePerformIO in sparks is about as safe as anywhere else outside of STM.
2021-05-04 13:33:30 <edmundnoble> Okay, and what if I have side effects which don't need to be evaluated at all, or in parallel
2021-05-04 13:33:32 <merijn> The entire reason sparks exists is to do evaluation of pure code in parallel
2021-05-04 13:33:46 <merijn> int-e: That was my point, yes
2021-05-04 13:34:03 <int-e> merijn: sorry, I didn't read much of context
2021-05-04 13:34:12 <merijn> In essence this entire line of questions boils down to "is unsafePerformIO bad?"
2021-05-04 13:34:15 <edmundnoble> Nah
2021-05-04 13:34:19 ddellaco_ joins (~ddellacos@ool-44c73afa.dyn.optonline.net)
2021-05-04 13:34:23 <merijn> To which the answer is: Yes
2021-05-04 13:34:34 <edmundnoble> It actually boils down to "what semantic insanity can I expect *specifically* from using unsafePerformIO in sparks"
2021-05-04 13:34:41 <edmundnoble> This was, in fact, my initial question
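One concrete answer to that initial question: because a sparked thunk may be evaluated by another capability, by the demanding thread, or never at all, an `unsafePerformIO` effect inside one runs zero or one times at an unpredictable point. A sketch of that nondeterminism (the upper bound is the only portable assertion):

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)
import GHC.Conc (par)
import System.IO.Unsafe (unsafePerformIO)

main :: IO ()
main = do
  ref <- newIORef (0 :: Int)
  -- A thunk whose evaluation has a side effect: exactly what
  -- sparked "pure" code is not supposed to contain.
  let effectful = unsafePerformIO (modifyIORef' ref (+ 1)) :: ()
  effectful `par` pure ()  -- spark it, but never demand the value
  n <- readIORef ref
  -- The effect ran zero or one times; which one is up to the RTS.
  print (n <= 1)
```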
2021-05-04 13:34:52 ddellac__ joins (ddellacost@gateway/vpn/mullvad/ddellacosta)

All times are in UTC.