Logs: freenode/#haskell

2021-04-23 18:00:28 carlomagno joins (~cararell@148.87.23.4)
2021-04-23 18:00:32 <monochrom> And never need VMContinue, ExceptT already does that.
2021-04-23 18:00:48 <jrp> Even cooler
2021-04-23 18:00:51 my_name_is_not_j joins (mynameisno@gateway/shell/matrix.org/x-hhwvpqfmykooxqsa)
2021-04-23 18:01:38 <minoru_shiraeesh> maybe it's the continue that you invoke inside a loop
2021-04-23 18:02:20 <monochrom> Ah geekosaur was saying what I just said.
2021-04-23 18:03:03 <minoru_shiraeesh> but usually you also have a break keyword in addition to continue
2021-04-23 18:03:08 lordcirth_ joins (~lordcirth@2607:f2c0:95b3:4400:11af:5eb6:2b18:3df9)
2021-04-23 18:03:23 × lordcirth__ quits (~lordcirth@2607:f2c0:95b3:4400:11af:5eb6:2b18:3df9) (Ping timeout: 250 seconds)
2021-04-23 18:03:36 × DTZUZU quits (~DTZUZO@205.ip-149-56-132.net) (Read error: Connection reset by peer)
2021-04-23 18:03:44 <monochrom> minoru_shiraeesh, does Forth have a break keyword?
2021-04-23 18:03:51 <minoru_shiraeesh> idk
2021-04-23 18:03:57 DTZUZU joins (~DTZUZO@205.ip-149-56-132.net)
2021-04-23 18:04:04 <monochrom> Thought so.
2021-04-23 18:04:08 × nut quits (~gtk@roc37-h01-176-170-197-243.dsl.sta.abo.bbox.fr) (Ping timeout: 265 seconds)
2021-04-23 18:05:00 jesser[m] joins (jessermatr@gateway/shell/matrix.org/x-aupqwvyiwubojerx)
2021-04-23 18:05:11 mnrmnaugh joins (~mnrmnaugh@unaffiliated/mnrmnaugh)
2021-04-23 18:05:54 <minoru_shiraeesh> that's what I'm saying, I'm not sure what that "continue" means, but it seems unlikely that it's a loop's continue
2021-04-23 18:06:09 <monochrom> It means the "next" in the old version.
2021-04-23 18:06:25 × nineonine quits (~nineonine@2604:3d08:7785:9600:35c4:856c:8487:6e07) (Ping timeout: 250 seconds)
2021-04-23 18:06:47 × LKoen quits (~LKoen@11.160.9.109.rev.sfr.net) (Remote host closed the connection)
2021-04-23 18:06:50 <wroathe> So I'm working on an HTML parser, serializer, and validator that may or may not see the light of day, but one of the thoughts that's been in the back of my head with the API is the old debate between String, Text, and ByteString. Some libraries make their functions polymorphic with the IsString constraint to allow the user to choose the string-like type that they're working with, and others opt for
2021-04-23 18:06:50 nineonine joins (~nineonine@50.216.62.2)
2021-04-23 18:06:56 <wroathe> ByteString only, and others still opt for individual functions for each type of string they support. Do any of you have strong opinions on this debate?
2021-04-23 18:07:51 lawr3nce joins (~lawr3nce@gateway/tor-sasl/lawr3nce)
2021-04-23 18:09:00 <monochrom> And there is also a debate on "when you say that, do you mean the input? do you mean the output?"
2021-04-23 18:09:18 <int-e> wroathe: It's an awful debate ;-)
2021-04-23 18:09:30 <wroathe> int-e: Why?
2021-04-23 18:09:30 <jrp> No, but it has internally an ?exit keyword, so if I run a block, I can bail before completing it  (so yes, break).  The break should not, however, propagate.  Execution should continue to the next word in the calling block.  Is there a neat forthBlock that achieves that?
2021-04-23 18:10:07 HiRE joins (~HiRE@2602:ffc5:20::1:512e)
2021-04-23 18:10:09 <monochrom> ExceptT has a try-catch mechanism.
2021-04-23 18:10:09 dpl joins (~dpl@77-121-78-163.chn.volia.net)
2021-04-23 18:10:16 <int-e> wroathe: So many strong opinions.
2021-04-23 18:10:20 × heatsink quits (~heatsink@108-201-191-115.lightspeed.sntcca.sbcglobal.net) (Remote host closed the connection)
2021-04-23 18:10:22 <wroathe> monochrom: Mind elaborating?
2021-04-23 18:10:47 <monochrom> Your parser takes an input and emits an output.
2021-04-23 18:11:25 <monochrom> The debate for "input should use String vs ByteString vs Text" is very different from the debate for "output should use String vs ByteString vs Text". In fact the opposite debate.
2021-04-23 18:13:03 <wroathe> monochrom: Well, AIUI the best for efficiency would be ByteString both ways, but if Haskell makes allowing the user to choose which string type they want to work with relatively easy, why not just IsString everything?
2021-04-23 18:13:44 <monochrom> Clearly, IsString only covers one end, I forgot which.
2021-04-23 18:14:11 <maerwald> :t fromString
2021-04-23 18:14:12 <lambdabot> error:
2021-04-23 18:14:12 <lambdabot> • Variable not in scope: fromString
2021-04-23 18:14:13 <lambdabot> • Perhaps you meant one of these:
2021-04-23 18:14:24 <monochrom> And I suspect that a truly general API would be polymorphic in both input and output but independently.
2021-04-23 18:14:36 <int-e> HTML is messy anyway... ideally you should be told the encoding (and then the answer is fairly straightforward, you'd want Text for the parsing part), but sometimes you aren't and need to find a charset declaration of some sort... probably treating the input as something byte-based like latin1 up to that point.
2021-04-23 18:14:48 <maerwald> @hoogle fromString
2021-04-23 18:14:49 <lambdabot> Data.String fromString :: IsString a => String -> a
2021-04-23 18:14:49 <lambdabot> GHC.Exts fromString :: IsString a => String -> a
2021-04-23 18:14:49 <lambdabot> Data.Text.Internal.Builder fromString :: String -> Builder
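[Editor's sketch] The hoogle result above settles which end IsString covers: its only method is fromString :: String -> a, so it lets code construct a string-like value but gives no way to inspect one. The example below is a minimal illustration using only base. The Markup type and the greeting value are hypothetical names standing in for Text or ByteString and for a library function.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.String (IsString (..))

-- Hypothetical string-like type, standing in for Text or ByteString.
newtype Markup = Markup String
  deriving (Eq, Show)

-- IsString supplies only the String -> a direction.
instance IsString Markup where
  fromString = Markup

-- Polymorphic in the type being produced; the caller picks it.
greeting :: IsString s => s
greeting = "<p>hello</p>"

main :: IO ()
main = do
  print (greeting :: Markup)     -- built as Markup
  putStrLn (greeting :: String)  -- built as a plain String
```

A function that needs to consume a polymorphic input would need some other class or an explicit conversion, since IsString offers no way back to characters or bytes.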
2021-04-23 18:14:52 <jrp> monochrom so I need to try, catch, rethrow other than VMExit?
2021-04-23 18:15:06 <monochrom> yes
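[Editor's sketch] One way to read the "try, catch, rethrow other than VMExit" approach, assuming the transformers package that ships with GHC. Only the name VMExit comes from the conversation. The VMError type, the VMFail constructor, runBlock, and say are hypothetical.

```haskell
module Main where

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, catchE, runExceptT, throwE)

-- Hypothetical error type; only the name VMExit appears in the discussion.
data VMError = VMExit | VMFail String
  deriving (Eq, Show)

type VM = ExceptT VMError IO

-- Run the words of one block. An exit ends this block only;
-- any other error keeps propagating to the caller.
runBlock :: [VM ()] -> VM ()
runBlock ws = sequence_ ws `catchE` handler
  where
    handler VMExit = pure ()   -- swallow the exit at the block boundary
    handler e      = throwE e  -- rethrow everything else

say :: String -> VM ()
say = lift . putStrLn

main :: IO ()
main = do
  r <- runExceptT $ do
    runBlock [say "a", throwE VMExit, say "skipped"]
    say "after block"  -- still runs: the exit did not propagate
  print r
```

Running this prints "a", then "after block", then Right (), so execution continues with the next word in the calling block.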
2021-04-23 18:15:54 <monochrom> Or don't use ExceptT. Define your own type that's like ExceptT but better.
2021-04-23 18:15:54 geowiesnot joins (~user@i15-les02-ix2-87-89-181-157.sfr.lns.abo.bbox.fr)
2021-04-23 18:15:58 lordcirth__ joins (~lordcirth@2607:f2c0:95b3:4400:11af:5eb6:2b18:3df9)
2021-04-23 18:16:09 <MrMobius> jrp, what is the forth for?
2021-04-23 18:16:10 timCF joins (~i.tkachuk@m91-129-104-226.cust.tele2.ee)
2021-04-23 18:16:19 <monochrom> ExceptT has two cases, but nothing says you can't plagiarize it and add one more case.
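[Editor's sketch] A minimal version of "ExceptT plus one more case", using only base. Every name here (Outcome, Failed, Exited, Done, VM, exit, block, say) is hypothetical and chosen for illustration. This is not an existing library type.

```haskell
module Main where

import Control.Monad (ap, liftM)

-- Three outcomes instead of ExceptT's two.
data Outcome e a = Failed e | Exited | Done a
  deriving Show

newtype VM e m a = VM { runVM :: m (Outcome e a) }

instance Monad m => Functor (VM e m) where
  fmap = liftM

instance Monad m => Applicative (VM e m) where
  pure = VM . pure . Done
  (<*>) = ap

instance Monad m => Monad (VM e m) where
  VM m >>= k = VM $ do
    r <- m
    case r of
      Done a   -> runVM (k a)
      Failed e -> pure (Failed e)  -- errors short-circuit
      Exited   -> pure Exited      -- so does an exit

-- Leave the enclosing block early.
exit :: Monad m => VM e m a
exit = VM (pure Exited)

-- Delimit a block: an exit inside it stops here, failures pass through.
block :: Monad m => VM e m () -> VM e m ()
block (VM m) = VM $ do
  r <- m
  pure $ case r of
    Exited -> Done ()
    other  -> other

say :: String -> VM e IO ()
say s = VM (Done <$> putStrLn s)

main :: IO ()
main = do
  r <- runVM (block (say "a" >> exit >> say "skipped") >> say "after block"
                :: VM String IO ())
  print r
```

This prints "a", "after block", and Done (). The design choice is that exit is a distinct outcome rather than a special error value, so block never has to pattern-match on the error type.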
2021-04-23 18:17:00 × juuandyy quits (~juuandyy@90.106.228.121) (Quit: Konversation terminated!)
2021-04-23 18:17:45 <wroathe> int-e: Yeah, I'm just digging into the parser design right now and encoding was something that I figured would throw a wrench into this high level plan
2021-04-23 18:18:12 juuandyy joins (~juuandyy@90.106.228.121)
2021-04-23 18:18:33 × lordcirth_ quits (~lordcirth@2607:f2c0:95b3:4400:11af:5eb6:2b18:3df9) (Ping timeout: 250 seconds)
2021-04-23 18:18:56 <jrp> MrMobius in my case an exercise, but it's normally used for programming fpgas or other low-level devices.  It's low overhead, fast and easy to bootstrap onto new devices
2021-04-23 18:19:11 <int-e> wroathe: For sanity it's probably best to separate the part that guesses the encoding from the actual parsing.
2021-04-23 18:19:14 <monochrom> Early experience proved that if you use String in the output, it takes too much space. No one would debate against that.
2021-04-23 18:19:14 LKoen joins (~LKoen@11.160.9.109.rev.sfr.net)
2021-04-23 18:19:21 <timCF> Hello! Is the expression `a <|> b <|> c` lazily evaluated in the case where every part is `ExceptT e m a` and `m` is some IO-like monad with side effects? I mean let's say `a` returned Left, then `b` returned Right. Will the side effects of `c` be executed in this case?
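[Editor's sketch] The question above can be checked directly, assuming the ExceptT from the transformers package, whose Alternative instance requires a Monoid error type. The step helper is hypothetical and the error type is fixed to [String] for the demonstration.

```haskell
module Main where

import Control.Applicative ((<|>))
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)

-- Announce that the step ran, then succeed or fail as instructed.
step :: String -> Either [String] Int -> ExceptT [String] IO Int
step name result = do
  lift (putStrLn ("running " ++ name))
  either throwE pure result

main :: IO ()
main = do
  r <- runExceptT
         (   step "a" (Left ["a failed"])
         <|> step "b" (Right 2)
         <|> step "c" (Right 3))
  print r
```

This prints "running a", "running b", and Right 2. The effects of c never run, because the instance only runs its right operand after the left one has returned Left.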
2021-04-23 18:19:27 <wroathe> int-e: Yup. That was exactly my thinking.
2021-04-23 18:19:44 <int-e> monochrom: Well, unless it's consumed (written to a file, say) on the spot...
2021-04-23 18:19:53 <monochrom> The rest of the debate seems to be simply, as usual, people disagreeing on even what "HTML parser" means.
2021-04-23 18:20:02 nut joins (~gtk@roc37-h01-176-170-197-243.dsl.sta.abo.bbox.fr)
2021-04-23 18:20:09 <wroathe> monochrom: Yeah, the main purpose of allowing for String was for ease of repl experimentation
2021-04-23 18:20:20 × vv8 quits (~egp_@2.95.117.163) (Quit: EXIT)
2021-04-23 18:20:36 <int-e> (IME, when data is basically streamed, String works much better than most people think.)
2021-04-23 18:20:47 <jrp> monochrom  I may end up writing an alternative ExceptT but I was trying to build on the existing ecosystem (hence the desire to use folds or other existing apparatus)
2021-04-23 18:20:52 <int-e> But *storing* text chunks as String is costly.
2021-04-23 18:20:58 × bitmagie quits (~Thunderbi@200116b806a8c30018046d968b59bdfe.dip.versatel-1u1.de) (Quit: bitmagie)
2021-04-23 18:21:13 <monochrom> Just for starters: HTML5 seems to say that the encoding can still be indeterminate until you read something in the header. But there are people who don't share this view.
2021-04-23 18:21:14 redmp joins (~redmp@172.58.35.164)
2021-04-23 18:21:16 <Clint> jrp: not sure what you're trying to do but ChronicleT is like ExceptT with a third case
2021-04-23 18:21:46 <monochrom> Right there you can have a "debate" on ByteString vs Text and it's just because people disagree on "should you assume an encoding before you start?"
2021-04-23 18:23:25 <monochrom> As usual, the correct answer is "it depends".
2021-04-23 18:23:53 <wroathe> monochrom: Valid points. I guess I can't punt on thinking about the encoding piece of this any longer, as it's all integrated.
2021-04-23 18:24:09 <monochrom> Some people write an HTML parser for scraping. In this case you don't assume an encoding, and you don't decode either, you can stay ByteString.
2021-04-23 18:24:36 <monochrom> Some other people write an HTML parser because they're writing a web browser. They have the opposite stake.
2021-04-23 18:24:42 <jrp> Thanks Clint  I'm trying to run a sequence of actions one of which might be a conditional break/exit   I'll have a look at Chronicle
2021-04-23 18:24:55 <wroathe> monochrom: I don't quite follow why scraping would be different than browser parsing
2021-04-23 18:25:11 <wroathe> monochrom: As far as I see, you'd want your parser to handle multiple encodings, regardless
2021-04-23 18:25:43 <wroathe> monochrom: I'll likely "assume" utf-8, until I see evidence that they want something else.
2021-04-23 18:25:49 <monochrom> "never decode" counts as succeeding with multiple encodings, including even invalid encodings?
2021-04-23 18:26:11 <wroathe> monochrom: "never decode"?
2021-04-23 18:26:40 nut` joins (~user@roc37-h01-176-170-197-243.dsl.sta.abo.bbox.fr)
2021-04-23 18:27:03 <monochrom> Suppose I just want to look for <span class="foo">...</span> and save the "..." in a file and let someone else interpret the "..."?
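[Editor's sketch] The "never decode" scraping case can be illustrated on raw bytes, assuming the bytestring package that ships with GHC. The names openTag and spanFoo are hypothetical, and the matching is deliberately naive: it does not handle nested spans, extra attributes, or different quoting.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Char8 as B

-- The ASCII marker searched for, byte for byte.
openTag :: B.ByteString
openTag = "<span class=\"foo\">"

-- Return the undecoded bytes between the first opening marker
-- and the next </span>, or Nothing if either is missing.
spanFoo :: B.ByteString -> Maybe B.ByteString
spanFoo doc
  | B.null rest  = Nothing
  | B.null close = Nothing
  | otherwise    = Just inner
  where
    (_, rest)      = B.breakSubstring openTag doc
    body           = B.drop (B.length openTag) rest
    (inner, close) = B.breakSubstring "</span>" body

main :: IO ()
main = do
  -- The payload byte 0xE9 is never interpreted; it is passed through as-is.
  print (spanFoo "<p><span class=\"foo\">caf\xe9</span></p>")
  print (spanFoo "<p>no match here</p>")
```

This only works when the markers themselves are byte-identical in the document's encoding, which holds for ASCII-compatible encodings but not, for example, for UTF-16.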
2021-04-23 18:27:25 <wroathe> monochrom: Well, even attribute values would be subject to encoding, would they not?
2021-04-23 18:27:35 <wroathe> monochrom: (I'm in the process of reading the spec on this as we speak)
2021-04-23 18:28:03 <monochrom> In practice people stick to ascii for class="foo".
2021-04-23 18:28:06 <wroathe> monochrom: But even locating the sequence <span would be subject to encoding
2021-04-23 18:28:13 × nut quits (~gtk@roc37-h01-176-170-197-243.dsl.sta.abo.bbox.fr) (Ping timeout: 252 seconds)
2021-04-23 18:28:24 <wroathe> monochrom: Even if multiple encodings share that same byte sequence
2021-04-23 18:28:30 × nut` quits (~user@roc37-h01-176-170-197-243.dsl.sta.abo.bbox.fr) (Remote host closed the connection)
2021-04-23 18:28:44 <wroathe> monochrom: Sure, but in practice most people use utf-8, and so I wouldn't even need to handle other encodings

All times are in UTC.