Logs: freenode/#haskell
| 2021-04-25 00:08:10 | <wz1000> | I like foldl, but I don't think a naive translation to streaming would buy you much |
| 2021-04-25 00:08:26 | <wz1000> | since there is the sort in the end which would need to materialize all items |
| 2021-04-25 00:08:30 | <shapr> | wz1000: count min sketch has a monoid instance |
| 2021-04-25 00:08:57 | × | Tario quits (~Tario@201.192.165.173) (Read error: Connection reset by peer) |
| 2021-04-25 00:09:12 | <shapr> | That or a counting bloom filter are the closest things I know to a streaming hashed data structure |
| 2021-04-25 00:09:12 | × | merijn quits (~merijn@83-160-49-249.ip.xs4all.nl) (Ping timeout: 240 seconds) |
| 2021-04-25 00:09:51 | <shapr> | Hm, good point about not being able to ask for the duplicates |
| 2021-04-25 00:11:22 | → | chimera joins (~chimera@168-182-134-95.pool.ukrtel.net) |
| 2021-04-25 00:14:40 | <shachaf> | Does constant memory actually matter? |
| 2021-04-25 00:15:43 | <shachaf> | You probably won't be running this on more than, say, 10M files. |
| 2021-04-25 00:16:23 | → | raehik joins (~raehik@cpc95906-rdng25-2-0-cust156.15-3.cable.virginm.net) |
| 2021-04-25 00:17:34 | <wz1000> | shapr: you can use `listDirectory` instead of `getDirectoryContents` and filtering out "." and ".." |
| 2021-04-25 00:18:06 | <wz1000> | a couple of `unsafeInterleaveIO`s might also be a tiny improvement in that function without the overhead of a full streaming library |
| 2021-04-25 00:22:31 | × | acidjnk_new quits (~acidjnk@p200300d0c72b958148671e179a62db06.dip0.t-ipconnect.de) (Ping timeout: 250 seconds) |
| 2021-04-25 00:23:06 | <shapr> | shachaf: I don't know, but now I have the urge to do heap profiling |
| 2021-04-25 00:23:17 | <shapr> | wz1000: ah good idea! |
| 2021-04-25 00:23:19 | → | elvishjerricco joins (~elvishjer@NixOS/user/ElvishJerricco) |
| 2021-04-25 00:23:26 | → | Tario joins (~Tario@201.192.165.173) |
| 2021-04-25 00:24:09 | × | stef204 quits (~stef204@unaffiliated/stef-204/x-384198) (Quit: WeeChat 3.1) |
| 2021-04-25 00:25:11 | × | justanotheruser quits (~justanoth@unaffiliated/justanotheruser) (Ping timeout: 260 seconds) |
| 2021-04-25 00:26:36 | → | ddellacosta joins (ddellacost@gateway/vpn/mullvad/ddellacosta) |
| 2021-04-25 00:27:09 | → | AsL joins (~asl@91.207.86.95) |
| 2021-04-25 00:27:17 | × | Tuplanolla quits (~Tuplanoll@91-159-68-239.elisa-laajakaista.fi) (Quit: Leaving.) |
| 2021-04-25 00:28:14 | × | Deide quits (~Deide@217.155.19.23) (Quit: Seeee yaaaa) |
| 2021-04-25 00:31:11 | × | ddellacosta quits (ddellacost@gateway/vpn/mullvad/ddellacosta) (Ping timeout: 240 seconds) |
| 2021-04-25 00:32:52 | × | AsL quits (~asl@91.207.86.95) () |
| 2021-04-25 00:34:35 | <shapr> | I wonder if a streaming library could work by holding a list of FilePath and hash at position? That would give you constant memory (per file) and still find duplicates? |
| 2021-04-25 00:35:57 | × | shailangsa quits (~shailangs@host86-185-58-137.range86-185.btcentralplus.com) () |
| 2021-04-25 00:36:24 | × | DavidEichmann quits (~david@147.136.46.217.dyn.plus.net) (Ping timeout: 252 seconds) |
| 2021-04-25 00:39:20 | × | nut quits (~gtk@roc37-h01-176-170-197-243.dsl.sta.abo.bbox.fr) (Ping timeout: 252 seconds) |
| 2021-04-25 00:39:27 | → | zmijunkie1 joins (~Adium@87.122.217.64) |
| 2021-04-25 00:40:17 | → | zebrag joins (~inkbottle@aaubervilliers-654-1-79-166.w86-212.abo.wanadoo.fr) |
| 2021-04-25 00:40:27 | × | chimera quits (~chimera@168-182-134-95.pool.ukrtel.net) (Quit: Konversation terminated!) |
| 2021-04-25 00:41:01 | × | bennofs_ quits (~quassel@dynamic-089-012-022-232.89.12.pool.telefonica.de) (Ping timeout: 252 seconds) |
| 2021-04-25 00:41:02 | <shapr> | Is there a comparison of the streaming libraries in Haskell? |
| 2021-04-25 00:41:19 | → | bennofs_ joins (~quassel@dynamic-077-013-095-012.77.13.pool.telefonica.de) |
| 2021-04-25 00:42:11 | × | zmijunkie quits (~Adium@87.122.222.145) (Ping timeout: 240 seconds) |
| 2021-04-25 00:42:21 | × | Tario quits (~Tario@201.192.165.173) (Read error: Connection reset by peer) |
| 2021-04-25 00:43:34 | → | nineonine joins (~nineonine@50.216.62.2) |
| 2021-04-25 00:47:05 | × | bitmagie quits (~Thunderbi@200116b806e24c00585f6c775438f3bb.dip.versatel-1u1.de) (Quit: bitmagie) |
| 2021-04-25 00:47:52 | × | nineonine quits (~nineonine@50.216.62.2) (Ping timeout: 240 seconds) |
| 2021-04-25 00:48:42 | <zebrag> | Starting from the "free monoid" adjunction. The counit is on the "monoid" side: it does have the information concerning the law of the monoid, it really can do the sum `[1,2,3]=1+2+3=6`. Now I want to consider the related "comonad". But I fail on "right counitality", where I get `[1,2,3]` mapped to `[6]`, where it should be the identity. I don't understand why: starting from an adjunction I should get the comonad, not |
| 2021-04-25 00:48:43 | <zebrag> | sweat? What did I miss? |
| 2021-04-25 00:49:25 | <zebrag> | The left counitality is okay, but the right counitality is really bothering me. |
| 2021-04-25 00:49:37 | × | raehik quits (~raehik@cpc95906-rdng25-2-0-cust156.15-3.cable.virginm.net) (Ping timeout: 252 seconds) |
| 2021-04-25 00:50:52 | × | is_null quits (~jpic@pdpc/supporter/professional/is-null) (Ping timeout: 240 seconds) |
| 2021-04-25 00:51:37 | <zebrag> | Maybe I'll get the answer there: http://blog.higher-order.com/blog/2015/10/12/freedom-and-forgetfulness/ |
| 2021-04-25 00:51:41 | <zebrag> | maybe |
| 2021-04-25 00:52:06 | <shachaf> | shapr: Of course a Haskell program will tend to use memory pretty inefficiently unless you're very careful. |
| 2021-04-25 00:52:23 | <zebrag> | (not speaking scala won't help) |
| 2021-04-25 00:52:25 | → | is_null joins (~jpic@pdpc/supporter/professional/is-null) |
| 2021-04-25 00:52:29 | → | olligobber joins (olligobber@gateway/vpn/privateinternetaccess/olligobber) |
| 2021-04-25 00:53:17 | → | Tario joins (~Tario@201.192.165.173) |
| 2021-04-25 00:56:27 | × | thc202 quits (~thc202@unaffiliated/thc202) (Ping timeout: 258 seconds) |
| 2021-04-25 00:59:08 | × | zebrag quits (~inkbottle@aaubervilliers-654-1-79-166.w86-212.abo.wanadoo.fr) (Quit: Konversation terminated!) |
| 2021-04-25 00:59:26 | → | zebrag joins (~inkbottle@aaubervilliers-654-1-79-166.w86-212.abo.wanadoo.fr) |
| 2021-04-25 01:00:42 | <edwardk> | zebrag: let's start with what the free monoid adjunction is. we have some kind of forgetful mapping U from the category of monoids with monoid homomorphisms as arrows to the category of sets where the arrows are just arbitrary functions. and a 'free monoid' would be a functor F that is left adjoint to this. F -| U. that is to say monoid homomorphisms F a -> b are in one to one correspondence with set homomorphisms |
| 2021-04-25 01:00:42 | <edwardk> | (functions) a -> U b. in haskell U b is boring: it is just 'b'. |
| 2021-04-25 01:00:44 | × | mrchampion quits (~mrchampio@38.18.109.23) (Read error: Connection reset by peer) |
| 2021-04-25 01:01:17 | → | ddellacosta joins (~ddellacos@ool-44c73afa.dyn.optonline.net) |
| 2021-04-25 01:01:24 | <edwardk> | now we can look at this isomorphism, and kind of prod at it to see what it must do. |
| 2021-04-25 01:01:40 | <zebrag> | (hi) |
| 2021-04-25 01:02:27 | <edwardk> | we know that monoid homomorphisms from F a -> b are in one to one correspondence with functions from a -> U b and we get to pick a and b. so what if we pick b = F a and just see where the identity arrow on "F a" goes? |
| 2021-04-25 01:03:08 | → | paddymahoney joins (~paddymaho@cpe9050ca207f83-cm9050ca207f80.cpe.net.cable.rogers.com) |
| 2021-04-25 01:03:28 | <edwardk> | we then get an arrow a -> U (F a) |
| 2021-04-25 01:03:40 | <edwardk> | this is the 'unit' of the adjunction or of the monad that comes from the adjunction. |
| 2021-04-25 01:03:49 | <zebrag> | yes |
| 2021-04-25 01:03:49 | <edwardk> | UF here is basically [] in haskell |
| 2021-04-25 01:04:00 | → | heatsink joins (~heatsink@108-201-191-115.lightspeed.sntcca.sbcglobal.net) |
| 2021-04-25 01:04:05 | → | ddellac__ joins (ddellacost@gateway/vpn/mullvad/ddellacosta) |
| 2021-04-25 01:04:27 | → | CelestiaIsTheWay joins (~sepples@sepples.xyz) |
| 2021-04-25 01:04:39 | <edwardk> | on the other hand we can look at the isomorphism the other way and pick a = U b. now we get F (U b) -> b as the counit of the adjunction or extract of the comonad. |
| 2021-04-25 01:05:02 | <zebrag> | ok |
| 2021-04-25 01:05:22 | <edwardk> | here we get a monad on Set (or Hask) and a comonad in the category of monoids |
| 2021-04-25 01:05:46 | × | ddellacosta quits (~ddellacos@ool-44c73afa.dyn.optonline.net) (Ping timeout: 252 seconds) |
| 2021-04-25 01:06:08 | <zebrag> | If I compose the lifted counit after the comultiplication, I must get the identity |
| 2021-04-25 01:06:37 | <zebrag> | but the lifted counit is executing the computation in [1,2,3] |
| 2021-04-25 01:06:53 | <zebrag> | returning 6 |
| 2021-04-25 01:07:17 | × | tsaka_ quits (~torstein@athedsl-4519432.home.otenet.gr) (Ping timeout: 246 seconds) |
| 2021-04-25 01:07:42 | → | mrchampion joins (~mrchampio@38.18.109.23) |
| 2021-04-25 01:07:50 | <zebrag> | lifted counit is executing the computation in [[1,2,3]]* |
| 2021-04-25 01:08:13 | → | malumore_ joins (~malumore@151.62.115.131) |
| 2021-04-25 01:08:27 | <zebrag> | the right counitality law says I should manage to get [1,2,3] |
| 2021-04-25 01:08:31 | × | ddellac__ quits (ddellacost@gateway/vpn/mullvad/ddellacosta) (Ping timeout: 252 seconds) |
| 2021-04-25 01:09:26 | <edwardk> | let's see. the duplicate/comultiplication for the comonad basically fmaps unit over F, turning FU into F[UF]U |
| 2021-04-25 01:10:30 | <zebrag> | (iiuc, on the monoid side, we not only have the elements, but also the law which must be used on them, attached to them) |
| 2021-04-25 01:10:31 | <edwardk> | your 'right counitality law' is the fmap extract . duplicate = id? |
| 2021-04-25 01:10:49 | <zebrag> | thinking hard... |
| 2021-04-25 01:10:59 | <zebrag> | yes |
| 2021-04-25 01:11:00 | <edwardk> | as opposed to extract . duplicate = id on the left? |
| 2021-04-25 01:11:08 | <zebrag> | correct |
| 2021-04-25 01:11:24 | × | whataday quits (~xxx@2400:8902::f03c:92ff:fe60:98d8) (Remote host closed the connection) |
| 2021-04-25 01:11:26 | × | malumore quits (~malumore@151.62.120.175) (Ping timeout: 240 seconds) |
| 2021-04-25 01:11:42 | <zebrag> | okay, you gave me the translation of the equation in haskell, good point |
| 2021-04-25 01:12:32 | → | whataday joins (~xxx@2400:8902::f03c:92ff:fe60:98d8) |
| 2021-04-25 01:12:50 | <edwardk> | except remember, here extract and duplicate are in the category of monoids. so they must each be monoid homomorphisms |
| 2021-04-25 01:13:02 | <zebrag> | correct |
| 2021-04-25 01:13:29 | <zebrag> | so extract really knows about the natural number law |
| 2021-04-25 01:13:36 | <zebrag> | 1 + 2 = 3 |
| 2021-04-25 01:13:50 | <edwardk> | natural number law being the particular monoid you want to reduce with? |
All times are in UTC.
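
The free monoid discussion above can be checked concretely. The following is only a sketch, assuming that F U is modelled by Haskell lists over a `Monoid`, that the counit is `mconcat`, and that the comultiplication maps the unit over each element, as edwardk describes. The names `unit`, `counit`, `duplicateFU`, `leftLaw`, `rightLaw`, `unitOutside` and `sample` are made up for illustration and do not come from the log or from any library. The last definition is a guess at one way to end up with `[6]`, not a statement of what zebrag actually computed.

```haskell
import Data.Monoid (Sum (..))

-- Unit of the adjunction, a -> U (F a): wrap an element in a singleton list.
unit :: a -> [a]
unit x = [x]

-- Counit, F (U m) -> m: reduce the list with the monoid's own operation.
counit :: Monoid m => [m] -> m
counit = mconcat

-- Comultiplication, F U -> F U F U: map the unit over each element.
duplicateFU :: [m] -> [[m]]
duplicateFU = map unit

-- "Left" law from the log: extract . duplicate = id.
-- Here counit is used at the list monoid, so it is plain concatenation.
leftLaw :: [m] -> [m]
leftLaw = counit . duplicateFU

-- "Right" law from the log: fmap extract . duplicate = id.
-- Each singleton is reduced on its own, so nothing is summed together.
rightLaw :: Monoid m => [m] -> [m]
rightLaw = map counit . duplicateFU

-- Assumption for contrast: wrapping the whole list with unit (instead of
-- mapping unit over it) and then mapping counit collapses it to one element.
unitOutside :: Monoid m => [m] -> [m]
unitOutside = map counit . unit

-- The list [1,2,3] under addition.
sample :: [Sum Int]
sample = map Sum [1, 2, 3]
```

Under these assumptions both `leftLaw sample` and `rightLaw sample` give back `sample`, while `unitOutside sample` yields the single element `Sum 6`. The difference is whether the unit is applied inside the outer F (to each element) or outside it (to the whole list).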