MetaBrainz

Room ID: !hyuEWFORfWYLZitABm:chatbrainz.org

_BrainzGit (IRC)
[musicbrainz-server] reosarevok merged pull request #3544 (master…MBS-14024): MBS-14024: Show genre rels on more places https://github.com/metabrainz/musicbrainz-server/pull/3544
BrainzBot (IRC)
MBS-14024: Artist-Genre relationships not visible on artist page https://tickets.metabrainz.org/browse/MBS-14024
_BrainzGit (IRC)
[metabrainz.org] fettuccinae opened pull request #508 (metabrainz-notifications…metabrainz-notifications): Add notification/send endpoint. https://github.com/metabrainz/metabrainz.org/pull/508
_BrainzGit (IRC)
[musicbrainz-server] reosarevok merged pull request #3543 (master…MBS-14021): MBS-14021: Report for pseudo-releases with cover art https://github.com/metabrainz/musicbrainz-server/pull/3543
BrainzBot (IRC)
MBS-14021: Report: pseudo-releases with cover art https://tickets.metabrainz.org/browse/MBS-14021
m.amanullah7

hey lucifer, I'm having trouble verifying the OAuth2 implementation, and I can't validate the API because of the Funkwhale setup: when I try to log in as the Funkwhale superuser I get access denied

how can I fix this?
should I push this commit?

reosarevok
yvanzo: I suspect SEARCH-745 is an oversight, but can you confirm it's not intentional?
BrainzBot (IRC)
SEARCH-745: Aliases in search responses lack "ended" data https://tickets.metabrainz.org/browse/SEARCH-745
lucifer
yes you can push your changes and I'll try to test it locally
m.amanullah7
Sure thanks!
_BrainzGit (IRC)
[musicbrainz-server] reosarevok opened pull request #3556 (master…MBS-14050): MBS-14050: Add tests for find_best_primary_alias https://github.com/metabrainz/musicbrainz-server/pull/3556
BrainzBot (IRC)
MBS-14050: Add tests for find_best_primary_alias https://tickets.metabrainz.org/browse/MBS-14050
reosarevok
derat: if you want to take a quick look at some point in case you can think of more test cases, it'd be appreciated :) ^
_BrainzGit (IRC)
[listenbrainz-server] RayyanSeliya opened pull request #3295 (master…add-ia-indexer): Add Internet Archive indexer for metadata cache https://github.com/metabrainz/listenbrainz-server/pull/3295
rayyan_seliya123
Hey @lucifer:chatbrainz.org: just added my indexer script for your review, let me know if any changes are needed. I tried to run the script but it was giving me redis connection errors even though my docker was running. I used ./develop.sh build and ./develop.sh up, and then ran python -m listenbrainz.metadata_cache.internetarchive.ia_indexer 🙂
_BrainzGit (IRC)
[metabrainz.org] mayhem merged pull request #507 (metabrainz-notifications…notification-table): Add fetch, mark and delete endpoints for notifications https://github.com/metabrainz/metabrainz.org/pull/507
m.amanullah7
lucifer: Done, you can check now! And do let me know if any changes are needed! It's untested on my end, so I hope it doesn't have many errors or issues!
https://github.com/metabrainz/listenbrainz-server/pull/3289
mayhem
    (.ve) blackbox:~/metabrainz/faster-fuzzy/build->./mapping_server ../index/ "Kikagaku Moyo" nana "masana temples"
    ARTIST SEARCH: 'Kikagaku Moyo' (kikagakumoyo)
    kikagakumoyo 1301284 1.00
    kikagakumoyojihexuemoyang 1678020 0.70
    kikagakumoyojihexuemoyang 2707939 0.70
    nagakumo 3600796 0.64
    akumo 4303035 0.53
    RELEASE SEARCH: 'masana temples' (masanatemples)
    masanatemples 1.00
    RECORDING SEARCH: 'nana' (nana)
    nana 1.00
    fetch metadata 1301284 2838103 23087156
    1301284 3a605eba-b6a1-4298-855d-b3033df0bf8b Kikagaku Moyo
    2838103 615b8e61-4be8-4385-a0d3-0894f91bfa6b Masana Temples
    23087156 1bb37d6c-6eed-4294-8341-1efa4cdce0d8 Nana
    { "Kikagaku Moyo","masana temples","nana","3a605eba-b6a1-4298-855d-b3033df0bf8b","615b8e61-4be8-4385-a0d3-0894f91bfa6b","1bb37d6c-6eed-4294-8341-1efa4cdce0d8" },

    (.ve) blackbox:~/metabrainz/faster-fuzzy/build->./mapping_server ../index/ "幾何学模様" nana "masana temples"
    ARTIST SEARCH: '幾何学模様' (jihexuemoyang)
    jihexuemoyang 1301284 1.00
    jihexuemoyang 1230474 1.00
    jihexuemoyang 1678020 1.00
    jihexuemoyang 2707939 1.00
    kikagakumoyojihexuemoyang 1678020 0.72
    kikagakumoyojihexuemoyang 2707939 0.72
    zhexue 2760996 0.51
    RELEASE SEARCH: 'masana temples' (masanatemples)
    masanatemples 1.00
    RECORDING SEARCH: 'nana' (nana)
    nana 1.00
    fetch metadata 1301284 2838103 23087156
    1301284 3a605eba-b6a1-4298-855d-b3033df0bf8b Kikagaku Moyo
    2838103 615b8e61-4be8-4385-a0d3-0894f91bfa6b Masana Temples
    23087156 1bb37d6c-6eed-4294-8341-1efa4cdce0d8 Nana
    { "幾何学模様","masana temples","nana","3a605eba-b6a1-4298-855d-b3033df0bf8b","615b8e61-4be8-4385-a0d3-0894f91bfa6b","1bb37d6c-6eed-4294-8341-1efa4cdce0d8" },
mayhem
monkey: ^^
zas
yamaoka was rebooted and the spammy message in logs tamed
mayhem
2025-06-05 15:24:13: artist index size: 344452391
mayhem
where did my code invent all these artists from???
mayhem
oh. bytes. not rows. 😆
julian45
just out of curiosity, what is producing the "jihexuemoyang" in your output? seems like an attempt at a mandarin reading of the kanji in this artist name, which is not exactly useful here IMO since the kanji are for a japanese name
mayhem
well spotted, but not a cause for alarm. The package in question is unidecode that attempts to come up with a stable ASCII serialization for all "non ascii words". The point is never to put these in front of humans, since they are a ton of garbage.
mayhem
they are, however, quite useful for making search indexes, where the search results will need to be checked against the original strings and not the index garbage.
julian45
i see
julian45
makes sense
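The index-then-verify idea described above can be sketched roughly as follows. This is only an illustration and not the faster-fuzzy code: `to_key`, `build_index` and `search` are hypothetical names, the artist ids are made up, and the small `_TRANSLIT` table is a stand-in for unidecode, filled in from the `jihexuemoyang` key in the output above.

```python
import re

# Stand-in for unidecode (assumption): a per-character table that reproduces
# the "jihexuemoyang" key seen in the search output. The real package covers
# far more characters.
_TRANSLIT = {"幾": "ji", "何": "he", "学": "xue", "模": "mo", "様": "yang"}


def to_key(text):
    """Build a stable ASCII index key; never meant to be shown to humans."""
    ascii_text = "".join(_TRANSLIT.get(ch, ch) for ch in text)
    # Lowercase and drop everything that is not an ASCII letter or digit.
    return re.sub(r"[^a-z0-9]", "", ascii_text.lower())


def build_index(artists):
    """Map each index key to the (artist_id, original_name) pairs sharing it."""
    index = {}
    for artist_id, name in artists:
        index.setdefault(to_key(name), []).append((artist_id, name))
    return index


def search(index, query):
    """Look up candidates by key, then check them against the original strings."""
    candidates = index.get(to_key(query), [])
    return [(artist_id, name) for artist_id, name in candidates if name == query]


# Hypothetical ids; both names collapse to the same index key.
ARTISTS = [(1, "幾何学模様"), (2, "Ji He Xue Mo Yang")]
INDEX = build_index(ARTISTS)
```

Both sample names share the key `jihexuemoyang`, so the key lookup alone returns two candidates; comparing against the original strings is what separates them.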
_BrainzGit (IRC)
[musicbrainz-server] reosarevok opened pull request #3557 (master…MBS-14046): MBS-14046: Use user time zone for anniversary message https://github.com/metabrainz/musicbrainz-server/pull/3557
BrainzBot (IRC)
MBS-14046: Anniversary message displayed too soon https://tickets.metabrainz.org/browse/MBS-14046
_BrainzGit (IRC)
[musicbrainz-server] reosarevok opened pull request #3558 (master…MBS-13983): MBS-13983: Load primary aliases in more places for autocomplete / ws::js https://github.com/metabrainz/musicbrainz-server/pull/3558
BrainzBot (IRC)
MBS-13983: Primary aliases not always shown on the release relationship editor https://tickets.metabrainz.org/browse/MBS-13983
lucifer
holycow23: Listens have msid so I will need a mapping of msid to the mbid, yes, but they can also have mbids. There are multiple cases possible: listens with only a msid and no mapped mbid, listens with a msid mapped to a mbid, and listens with a msid and a user-specified mbid. The rest is correct.
holycow23
I am talking about the moment when a user plays a song: it comes in as a listen with just the msid, right, and then it is mapped to the corresponding mbid
lucifer
@holycow23:matrix.org it's assigned a msid and we try to map it to a mbid but a match may not be found always.
holycow23
Okay
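The three cases listed above could be told apart along these lines. This is a minimal sketch and not the ListenBrainz data model: the `Listen` class, its field names and the `mbid_source` helper are hypothetical, and preferring a user-specified mbid over a mapped one is an assumption made only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Listen:
    """Hypothetical, simplified listen: every listen has a msid."""
    msid: str
    mapped_mbid: Optional[str] = None  # set only if the mapper found a match
    user_mbid: Optional[str] = None    # set only if the user supplied one


def mbid_source(listen):
    """Return which of the three cases a listen falls into."""
    if listen.user_mbid is not None:
        return "user"      # msid plus a user-specified mbid
    if listen.mapped_mbid is not None:
        return "mapped"    # msid that was mapped to a mbid
    return "unmapped"      # msid only, no match found
```

Any stats code that needs mbids has to decide what to do with the `"unmapped"` case, since a match may not always be found.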
_BrainzGit (IRC)
[metabrainz.org] amCap1712 opened pull request #509 (master…client-credentials-grant): oauth: add client credentials grant https://github.com/metabrainz/metabrainz.org/pull/509
lucifer
mayhem: @fettuccinae:matrix.org will need this for the auth parts of his project at some point ^
holycow23

lucifer: I have a code snippet below; these are the queries being used right now in the listening_activity stats

    def get_aggregate_query(self, table):
        return f"""
            SELECT user_id
                 , date_format(listened_at, '{self.spark_date_format}') AS time_range
                 , count(listened_at) AS listen_count
              FROM {table}
          GROUP BY user_id
                 , time_range
        """

    def get_table_prefix(self) -> str:
        return f"{self.entity}_listeners_{self.stats_range}"

    table = f"{self.provider.get_table_prefix()}_full_listens"
    full_query = self.provider.get_aggregate_query(table)
    full_df = run_query(full_query)

On local I would access the listens in TimescaleDB through the listens table, with a filter on the time frame for the stats_range, but the above code does that differently. Is that just for prod or am I going wrong somewhere?
lucifer
holycow23: if you are running it on spark then it doesn't involve timescaledb at all. but yes this particular query could be run on timescaledb by accessing listens from listens table if you wanted to.
lucifer
we don't run stats queries on timescaledb at all to avoid slowing down the rest of the website.
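To see what the aggregate query computes without a Spark setup, here is a small stand-in that runs the same GROUP BY shape on an in-memory SQLite table. It is only a sketch: SQLite's `strftime` replaces Spark's `date_format`, and the `listens` table, the sample rows and the `listen_counts` helper are assumptions for illustration, not the ListenBrainz schema.

```python
import sqlite3

# Stand-in data: (user_id, listened_at). Real listens carry many more fields.
ROWS = [
    (1, "2025-05-30 10:00:00"),
    (1, "2025-05-31 23:59:59"),
    (1, "2025-06-01 00:00:01"),
    (2, "2025-06-05 15:24:13"),
]


def listen_counts(rows, date_format="%Y-%m"):
    """Count listens per user and per time bucket, like the Spark query does."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listens (user_id INTEGER, listened_at TEXT)")
    conn.executemany("INSERT INTO listens VALUES (?, ?)", rows)
    query = """
        SELECT user_id
             , strftime(?, listened_at) AS time_range
             , count(listened_at) AS listen_count
          FROM listens
      GROUP BY user_id
             , time_range
      ORDER BY user_id
             , time_range
    """
    result = conn.execute(query, (date_format,)).fetchall()
    conn.close()
    return result
```

Calling `listen_counts(ROWS)` buckets by month; passing `"%Y-%m-%d"` buckets by day, which mirrors how the format string decides the granularity of `time_range`.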
holycow23
So is data being sent over from timescaleDB to Spark in those specific table format?
lucifer
the data is sent as parquet files, for instance you can download this dump https://data.metabrainz.org/pub/musicbrainz/listenbrainz/incremental/listenbrainz-dump-2156-20250605-000003-incremental/listenbrainz-spark-dump-2156-20250605-000003-incremental.tar, extract it and get parquet files.
lucifer
you can load those parquet files in pandas or duckdb etc and explore their content.
holycow23
Okay will look into it
lucifer
the schema in spark terms is defined at: https://github.com/metabrainz/listenbrainz-server/blob/f9c919822105d14a82cc2f2abff7fa235752eae7/listenbrainz_spark/schema.py#L36 but browsing the file with pandas/tool of your choice might help make it clearer
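One possible way to explore an extracted dump with pandas is sketched below. The directory layout inside the tar is not described in this chat, so the helper simply searches recursively for `.parquet` files; `find_parquet_files` and `load_listens` are hypothetical names, and `load_listens` assumes pandas with a parquet engine such as pyarrow is installed.

```python
from pathlib import Path


def find_parquet_files(root):
    """Return all .parquet files under an extracted dump directory, sorted by path."""
    return sorted(Path(root).rglob("*.parquet"))


def load_listens(root):
    """Load every parquet file under root into a single pandas DataFrame."""
    # Imported here so the file search works even without pandas installed.
    import pandas as pd

    files = find_parquet_files(root)
    if not files:
        raise FileNotFoundError(f"no parquet files under {root}")
    return pd.concat((pd.read_parquet(f) for f in files), ignore_index=True)
```

After extracting the tar, something like `df = load_listens("path/to/extracted-dump")` followed by `print(df.dtypes)` and `print(df.head())` shows the columns to compare against the schema file linked above.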
_BrainzGit (IRC)
[listenbrainz-server] MonkeyDo opened pull request #3296 (master…bootstrap-fixes): Fix Bootstrap5 migration issues https://github.com/metabrainz/listenbrainz-server/pull/3296
lucifer
rayyan_seliya123: did you run the python -m listenbrainz.metadata_cache.internetarchive.ia_indexer command inside the docker container or outside?
lucifer
try running ./develop.sh exec web python -m listenbrainz.metadata_cache.internetarchive.ia_indexer
rayyan_seliya123
I think outside !
rayyan_seliya123
Ok, let me try this. Or do I also need to run ./develop.sh build and up?
lucifer
rayyan_seliya123: your docker containers need to be up when you run this, so if they are not already running you need to run ./develop.sh up, yes.
lucifer
if your docker containers are running already then just run the exec command.
rayyan_seliya123
Okk I am building and running it
lucifer
the code inside listenbrainz directory is volume mapped to the container so it will update automatically. you need to run build only if you add a python or javascript dependency.
rayyan_seliya123
Sure !
rostiku
Hi guys, can I ask a question here about track filtering?
rayyan_seliya123
image.png
rayyan_seliya123
i am getting this after build and up and that command
lucifer
you need to import it as listenbrainz.metadata_cache.internetarchive
lucifer
sure
rostiku
hey lucifer
rostiku
this question isnt really about metabrainz, but musicbrainz database structure
lucifer
musicbrainz dev team is in the channel so they can answer your questions.
rostiku
I see musicbrainz categorizes albums into release_group entries, and then a release_group can have many releases. I'm looking for a way to filter for the original release in a release_group. As I can see that some releases belonging to a release_group can be versions from Japan / Europe, which I am not interested in, but they are still marked as "official" in release_status
rostiku
I would come up with a sketchy way to do this myself, but looking to see an expert opinion :P
lucifer
i don't think there is a concept of the original release but it should be possible to find the earliest release.
rostiku
thats kind of what I'm thinking, it looks like the "original" release would be the one that came first, and since we have release dates, we can filter out that way
reosarevok
That sounds about right
reosarevok
Feel free to just filter by earliest if you prefer. Selecting by min date would work probably?
rostiku
well they can also sometimes share dates
bitmap
you can join with release_first_release_date to do that
rostiku
release_first_release_date is a table?
bitmap
if you want to prioritize certain countries, too, you can join with the release_event view
bitmap
(or just the release_country table if you want to exclude releases without any country set)
bitmap
yes, and it has to be built first
bitmap
./admin/BuildMaterializedTables release_first_release_date
rostiku
cool, I will look into it, thanks guys
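The "earliest release per release group" approach discussed above might look like this. It is a sketch run against an in-memory SQLite database rather than a real MusicBrainz PostgreSQL instance: the two tables are cut down to the columns the query needs (assumed here to be `release.id`, `release.release_group` and `release_first_release_date.release/year/month/day`), the sample rows are invented, and breaking ties on shared dates by lowest release id is an arbitrary choice for the example.

```python
import sqlite3

SCHEMA = """
    CREATE TABLE release (id INTEGER PRIMARY KEY, release_group INTEGER, name TEXT);
    CREATE TABLE release_first_release_date (
        release INTEGER, year INTEGER, month INTEGER, day INTEGER
    );
"""

# Invented sample data: releases 2 and 3 share the earliest date in group 10.
RELEASES = [(1, 10, "JP edition"), (2, 10, "Original"), (3, 10, "EU edition"), (4, 20, "Other")]
DATES = [(1, 2018, 11, 7), (2, 2018, 10, 5), (3, 2018, 10, 5), (4, 2020, 1, 1)]

EARLIEST_QUERY = """
    SELECT release_group, id
      FROM (
            SELECT r.release_group
                 , r.id
                 , ROW_NUMBER() OVER (
                       PARTITION BY r.release_group
                       -- Ties on the date fall back to the lowest release id.
                       -- Partial dates (NULL month/day) would need explicit
                       -- NULL ordering, which differs between SQLite and PostgreSQL.
                       ORDER BY d.year, d.month, d.day, r.id
                   ) AS rn
              FROM release r
              JOIN release_first_release_date d ON d.release = r.id
           ) ranked
     WHERE rn = 1
  ORDER BY release_group
"""


def earliest_releases():
    """Return (release_group, release_id) for the earliest release of each group."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO release VALUES (?, ?, ?)", RELEASES)
    conn.executemany("INSERT INTO release_first_release_date VALUES (?, ?, ?, ?)", DATES)
    rows = conn.execute(EARLIEST_QUERY).fetchall()
    conn.close()
    return rows
```

To prioritize certain countries instead of the lowest id, the tie-break column could come from a join with `release_country` or the `release_event` view, as suggested above.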
Jade
re that DMARC report ticket by the way, there's a fair chance that it isn't our issue at all and is in fact someone else trying to send email from the metabrainz domain
Jade
I've got DMARC reports coming in on a domain which I know isn't sending any emails now, for example
Jade
Also julian45 you mentioned when we were talking that it was HAProxy? Got sent this today, perhaps it'll be useful https://progress.opensuse.org/news/125
Jade
They ended up using https://github.com/DropMorePackets/berghain
julian45
thx, will have a look at these!
derwin (IRC)
hi. I am finally getting around to writing a bot for Remix relationships. and by "I" of course I mean vibe coding by Claude.ai. any advice on whether I should do one bot like murdos_bot or one bot per function or etc?
rayyan_seliya123
In reply to lucifer
you need to import it as listenbrainz.metadata_cache.internetarchive
@lucifer:chatbrainz.org: This didn't resolve my issue; it was giving me the same module not found error, so I kept the import internetarchive. The problem was that internetarchive was not installing in my local environment. After running ./develop.sh build web multiple times, version 5.4.0 was installed successfully, and then I tried ./develop.sh exec web python -m listenbrainz.metadata_cache.internetarchive.ia_indexer again. It started indexing and then stopped with an array of errors, which you can see in my gist https://gist.github.com/RayyanSeliya/8a7eda74ec3322526b20bf26c1b9b76f. I think I need help aligning the indexer code with all the required listenbrainz fields to run it properly!
Aerozol
image.png
Aerozol
🥰