"Confidence:1" after _second_ submitted result, yes?



Gew
01-21-2010, 10:49 PM
Hi!

Sorry to bother you guys again.
It's probably in some thread here but I'm so tired, and this is bugging me. I think I have all the pieces to the puzzle. Now, only this. As I've understood it, when there is no AccurateRip record for a track, when the very first user submits it, it doesn't get "confidence:1", due to the high level of uncertainty, plus the fact that all submitted misreads would be "strolling around" the database for no good at all. So, when a user submits a CRC for the first time, it gets sort of "in-pipeline-waiting-for-further-verifications", ie confidence:0.


In short:
When a ripped track gives me "Confidence:1" it means that 2 unique users/computers has ripped the track in particular, with the same checksum result. "Confidence:29" means 30 unique submissions for the track, etc, etc, yes?


Like I said, sorry to bother you about this. The more I think about it, it seems like this is the only real answer to the question, cus otherwise there would be - like I thought - lots of messy garbage CRCs in AccurateRip. Just need a tiny confirmation on this, hopefully from Spoon, so that I can find inner peace.

Regards~
And bigups for all the good work~

Spoon
01-22-2010, 06:20 AM
> "Confidence:1" it means that 2 unique users/computers has ripped the track in particular

Not always, if there is only one pressing of a CD then a single submission would be made available.

garym
01-22-2010, 09:11 AM
In terms of peace of mind, even the fact that one other user has ripped the CD on a different machine, different drive, different physical CD, and still produced the exact CRCs you are getting in your rip is still a good thing (and better than ripping twice on your own machine and comparing the results).

eaglescout1998
01-22-2010, 09:44 AM
A confidence number of only 1 is not a guarantee of an accurate rip. Unless you have just purchased the disc and are ripping it for the very first time, it is possible that you are comparing with your own previous rip (and not someone else's).

garym
01-22-2010, 11:58 AM
> A confidence number of only 1 is not a guarantee of an accurate rip. Unless you have just purchased the disc and are ripping it for the very first time, it is possible that you are comparing with your own previous rip (and not someone else's).

Hmmm, there is quite a delay between ripping a disc and my OWN results being added to the AccurateRip database. For example, if I rip the disc, then rip it again tomorrow (or even next week), my initial rip would NOT yet be in the database. You are correct that if I ripped it, and then a year later I ripped it again, the AccurateRip database might only contain my prior rip. For me this is not an issue, because I rip my discs only once (to FLAC) with very rare exceptions.

Gew
01-22-2010, 01:06 PM
First, thanks for all your answers.

And yes, Spoon has previously stated that the AccurateRip database is in fact updated approx. once a month, therefore "..then rip it again tomorrow (or even next week), my initial rip would NOT yet be in the database.." is correct.

I think this cleared it up though. When a track (of a disc) is ripped for the first time, it gets Confidence:1 instantly (but only becomes available for lookups after a month or so).

The only thing I find strange is, assuming there are lots of folks who rip scratched discs and get faulty CRCs, and then submit those, there would be tons of "garbage CRCs" with Confidence:1 for, well, lots of albums/pressings out there. Right? This is why I thought that a track/pressing gets C:1 only after being "validated" by a second submission. But I take it this is not the case then. Hmm..

Oh, perhaps I got a clue after all.

Look, do you (Spoon) mean that the very first CRC submission for a Cuesheet-->TrackNo gets added after only a single submission, while any submission (with a different CRC) after this original one requires at least one subsequent validation to get Confidence:1? This would sort of narrow down the number of -- possible -- faulty scratch garbage CRCs to 1 for each pressing, correct?

In short:
A single submission of a new crc (pressing) on a cuesheet (album) is indeed (after ~a month) added with Confidence:1.
Second submission gets Confidence:2, etc, etc..
?

Spoon
01-23-2010, 10:19 AM
Scratches are removed from the database when two people verify the rip; the 3rd scratch result would not make it to the database.

Gew
01-23-2010, 11:14 PM
> Scratches are removed from the database when two people verify the rip; the 3rd scratch result would not make it to the database.

I'm reading the line over and over, trying to visualize the process. I think I may be complicating it to myself. Examples are nice though, so here goes.

I rip a track from a disc, EAC gives me the "OK [1234ABCD]" but also states that "track is not in database". Then, is the track now added to database "queue", so that when I rip the track (submit it) with the same system roughly a month later, I get Confidence:1 after ripping, after only one single submission..?

Then, say I have another pressing of the same album (belonging to the same cuesheet "folder" in database). I pop the disc and rip the same track. Now, I assume I would get a notice with i.e. "Track could not be validated [4321DCBA]" plus "AR returns: [1234ABCD]".

It's really around here my confusion starts. Say I then rip the second pressing again from another computer. A month later, any ripping of this disc track would give me "Track validated [4321DCBA]".

This is where you tell me that the original crc (1234ABCD) is wiped out of database, due to only Confidence:1. So, from here on, that second pressing is the only one in database for "cuesheet folder" in db, yes?

Then, let's say I rip that original disc track again, from another computer than before. Then you tell me that the Confidence:1 was wiped out of the db as two people (computers*) verified the 4321DCBA pressing.

So, despite the earlier submission of 1234ABCD, it's now back on Confidence:1. However, now both the 1234ABCD & 4321DCBA checksums exist, side by side, in that [my own term of expression] "cuesheet folder" in the database.

So, have I gotten it right this far?

Now, finally.

Take the possibility that I get hold of a 3rd pressing of the album. I rip it as well. After ripping I get e.g. "Could not be verified [11223344], AR returned [either 1234ABCD or 4321DCBA]". I submit my rip.

Also, one thing that seems a bit mysterious.
Picture this, from scratch; e.g. blank sheet for TOC in database.

ABCD1234 is ripped and submitted. (1x)
DCBA4321 is ripped and submitted. (1x)
11223344 is ripped and submitted (1x).

All three pressings are ripped but none of them are (yet) verified. Will all three be present with Confidence:1, until..? Say I submit one of them, for example DCBA4321, a second time (weee; validation!). Will this validation then "knock out" both ABCD1234 and 11223344 completely?

Ty for your patience.
Regards~

Spoon
01-24-2010, 04:37 AM
Forget different pressings for now, it complicates:

A guy rips a disc and gets an incorrect rip as [BBBBBBBB] (because of a scratch); he submits and this result is in the db. You rip correctly, get [AAAAAAAA], and submit; now neither result is in the db. A 3rd guy rips and submits [AAAAAAAA]; now only [AAAAAAAA] appears.
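
To make the sequence above concrete, here is a minimal Python sketch. It is an illustration only, not AccurateRip's actual code: the function name `published` and the rule it encodes (a lone CRC is shown even at confidence 1; once CRCs conflict, only those submitted at least twice are shown) are assumptions inferred from the description in this post.

```python
from collections import Counter

def published(submissions):
    """Return {crc: confidence} visible for one track.

    `submissions` is the list of every CRC submitted so far for that track.
    Hypothetical rule, inferred from the post above.
    """
    counts = Counter(submissions)
    if len(counts) == 1:
        # Only one distinct CRC and nothing contradicts it: show it as-is.
        return dict(counts)
    # Conflicting CRCs: show only those confirmed by two or more submissions.
    return {crc: n for crc, n in counts.items() if n >= 2}

raw = []
for crc in ["BBBBBBBB", "AAAAAAAA", "AAAAAAAA"]:
    raw.append(crc)
    print(raw, "->", published(raw))
```

After the first submission only [BBBBBBBB] is visible, after the second nothing is, and after the third only [AAAAAAAA] is.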

Gew
01-24-2010, 10:50 AM
Okey!


> ...now only [AAAAAAAA] appears.

It appears with Confidence:1 then? Or does it somehow go directly from complete db absence to Confidence:2?


Also, just so that I've gotten it right.
Assuming complete blank sheet in db now..
If a guy rips [AAAAAAAA], and the next guy rips [AAAAAAAA] as well, Confidence leaps from 1 to 2, simple as that.

_whereas_

If a guy rips [AAAAAAAA], and the next guy gets a different checksum (e.g. [BBBBBBBB]), then [AAAAAAAA] (and [BBBBBBBB]) are wiped. In short, the database is always looking for -- subsequent -- similar checksums, and to prevent garbage sums it keeps wiping only-once-occurring sums.

Correct?

Spoon
01-24-2010, 03:06 PM
100% correct, it is a simple concept which works well.

Gew
01-25-2010, 12:57 PM
Nice!

So, you could say that, not only are they not of very high reliability, but tracks you get "Confidence:1" on are always in sort of a "hang-loose state", meaning they could be wiped at any time. Whereas tracks on which you get Confidence:2,3,4,5... are more or less "permanent" in the database. Correct?

Spoon
01-25-2010, 03:43 PM
Correct.

Gew
01-26-2010, 04:44 PM
> Forget different pressings for now, it complicates:
>
> A guy rips a disc and gets an incorrect rip as [BBBBBBBB] (because of a scratch); he submits and this result is in the db. You rip correctly, get [AAAAAAAA], and submit; now neither result is in the db. A 3rd guy rips and submits [AAAAAAAA]; now only [AAAAAAAA] appears.

This will -- seriously -- be the last piece of this -- much neurotic (I know) -- "scenario puzzle" that has been bugging me. Your process rendering above is very good. However, you verified that C:1 results are in a hang-loose state, whereas results of any higher confidence level are kinda stuck like glue in the db.

Let's assume your scenario above, but let's include (another) rip crc with a Confidence of 100 or so, in between [A..] and [B..]; call this one [C..] for simplicity.

Is this "ignored" (but just added: Confidence:101) in between the process of wiping subsequent 1-submit-only crc's?


Example:

1. A guy rips a disc and gets an incorrect rip as [BBBBBBBB] (because of a scratch), he submits and this result is in the db (Confidence:1).

2. A guy rips a disc and gets a correct rip as [CCCCCCCC] (Confidence:100), he submits and this result is in the db (Confidence:101).

3. You rip correctly, get [AAAAAAAA], and submit (Confidence:1).

Now, will the C:1 rips (no. 1 & 3) still get wiped after submitting no. 3? Even though a well-validated rip crc has come in between..?

Spoon
01-26-2010, 05:26 PM
1 and 3 would only appear after another person verifies.

Gew
01-26-2010, 06:36 PM
> 1 and 3 would only appear after another person verifies.

But it is there, somewhere in the db, although it doesn't appear. It just waits for a second submitter, right?

However, I've ripped some discs that gave me only "Track ripped OK! [CRC] (Confidence:1)". How do I know whether Confidence:1 means that only one person has submitted the track, or that it has in fact been verified by another?

Sigh, it's still pretty confuzing. Just as I thought I had it right, this wonder-cloud..

In what case (any at all??) does Confidence:1 mean that only one person has submitted a track? Or do tracks submitted by one person only (no 2nd submit/verification) "hide" from ppl ripping other crcs in the AR db?

Gew
01-26-2010, 06:40 PM
Perhaps not completely out in the blue.

Is it so that AccurateRip somehow "remembers" no. 1 & 3 (eg. the fact that they have been submitted one time each), so that they won't have to appear in (two) subsequent rips/submits to come..? That would change the fact that AR consistently looks for subsequent crcs.

Spoon
01-27-2010, 04:34 AM
There is the published database and raw database, the raw database contains all submissions.

Gew
01-27-2010, 11:52 AM
> There is the published database and raw database, the raw database contains all submissions.

Aaah. That explains a lot. So, it is also so that ripped track results are never displayed as Confidence:1 (in published database) until they are verified by at least one other (so that given checksum has had a total of 2 submissions)...?

Gew
01-27-2010, 01:29 PM
> Aaah. That explains a lot. So, it is also so that ripped track results are never displayed as Confidence:1 (in published database) until they are verified by at least one other (so that given checksum has had a total of 2 submissions)...?

In short:
Is it -- without exception? -- so that a track is required to have more than one submission to move from raw db to published db?

Spoon
01-27-2010, 03:19 PM
Correct

Gew
01-27-2010, 07:14 PM
> Correct

Marvelous! I think what made me confuzed earlier was how I read in some thread that "tracks that give C:1 cannot be trusted, since it could pretty much just be your own rip that has entered db". Then of course, you've earlier explained how the published db is updated about once a month, so that would still be a pretty far-fetched case, but I guess I brooded over it too much, which made the concept harder to understand. Getting to know the distinct difference between raw db and published db was a great relief, making everything so much easier. Ofc, if I'd found this thread (http://www.hydrogenaudio.org/forums/index.php?showtopic=53150) earlier, I would have known about this without having to ask.

Anyways, one last thing. Confidence:1. I could pretty much swear that I have gotten Confidence:1 on some of the CDs that I have ripped. How does this add up with -- quoting from that old thread -- "..When two or more results for individual track match, the data is moved into 'confirmed' database.." ?

Do you understand my mind flow right now? :P

We've concluded that at least two submissions are required for a crc to go to the published db. Then getting the rip result "Confidence:1" must be.. impossible? I might add I'm not 110% sure that I've gotten this result, but I _think_ that I got it once when I was ripping. Just to be clear.

Theoretical idea:
Is it so that when a track is being submitted for the very first time (discID has no previous track crcs in db), the crc will go directly into the published db (as C:1)..? And then comes that process of wiping both _if_ the next one submitted doesn't match..?

This would explain everything.
But I'm not entirely sure.
Only you know Spoon.
Am I close?

Gew
01-29-2010, 05:24 PM
> > "Confidence:1" it means that 2 unique users/computers has ripped the track in particular
>
> Not always, if there is only one pressing of a CD then a single submission would be made available.


I hate inconsistency/paradoxes.
They keep me up sleepless nights etc ;(

Right now this statement (quoted above) is what bugs me. I can't seem to reconcile it with the fact that each track crc submission needs at least a 2nd submit to be moved from raw db -> published db.

So..

Say %DiscID% is not present at all in db to begin with. Now user (for the first time) rips a track and submits it. Then the ripped crc goes straight to published db with Confidence:1, correct?

Spoon
01-29-2010, 06:18 PM
We changed the database to pull all conf:1 as a measure to drop inconsistent results, and stop the possibility of drives keying off bad offset rips.

Gew
01-29-2010, 06:26 PM
Oh, I see.

So..


> Say %DiscID% is not present at all in db to begin with. Now user (for the first time) rips a track and submits it. Then the ripped crc goes straight to published db with Confidence:1, correct?

That would then have been the past behaviour.

Whereas now the (1st) ripped crc would go into raw db with..

a) "conf:0" (symbolically speaking) - and then conf:1 in published db when a second user verifies it.

or

b) conf:1 in raw db - and then conf:2 in published db when a second user verifies it.

?

pjc2
01-29-2010, 08:08 PM
On Jan 29:

> We changed the database to pull all conf:1 as a measure to drop inconsistent results, and stop the possibility of drives keying off bad offset rips.

When did you make this change?

On Jan 22 you wrote:

> > "Confidence:1" it means that 2 unique users/computers has ripped the track in particular
>
> Not always, if there is only one pressing of a CD then a single submission would be made available.

and on Jan 24 you wrote:

> A guy rips a disc and gets an incorrect rip as [BBBBBBBB] (because of a scratch); he submits and this result is in the db. You rip correctly, get [AAAAAAAA], and submit; now neither result is in the db. A 3rd guy rips and submits [AAAAAAAA]; now only [AAAAAAAA] appears.

Or are all these the same and we're misunderstanding you?

Spoon
01-30-2010, 06:47 AM
Looking at the latest database code, it will keep Conf: 1 if no other submissions, so I was right in the first instance (the code did drop conf 1 at one time, but the recent rewrite restored it).

Gew
01-30-2010, 07:31 AM
> Originally Posted by Spoon
> A guy rips a disc and gets an incorrect rip as [BBBBBBBB] (because of a scratch); he submits and this result is in the db. You rip correctly, get [AAAAAAAA], and submit; now neither result is in the db. A 3rd guy rips and submits [AAAAAAAA]; now only [AAAAAAAA] appears.

So is this obsolete information?

And btw, is the confidence number always to be taken as the actual number of submissions, i.e. there is no offset of +/- 1 where the initial submission is "hang-loose" and the second submission counts as confidence:1?

Gew
01-30-2010, 01:05 PM
I have this CD. I ripped it and got "Could not be verified [CRC], AccurateRip returned [CRC_in_db]" on all tracks. So, I assume that my pressing was simply not in database. Anyways, after ripping all tracks I submitted them.

So..

Question 1) Will people who rip tracks from this album get my verification crcs with confidence:1 after approx. a month (next pub db update), or will it need a second submission??

Question 2) If there had been no previous pressings in the database, e.g. if I had gotten "Track is not in database" on every track of the disc mentioned above, would my single submission then have been enough to have users who rip (after ~a month) get OK/confidence:1..?

I have a feeling that having these two questions answered will distinguish satisfactory answers on all counts, even out any question marks, etc.. So, thank you in advance! ;)

Spoon
01-30-2010, 03:17 PM
Try R14 it can check across pressings (in the beta section of this forum).

Gew
01-30-2010, 04:36 PM
Hehe. Well, those examples (see questions 1+2) were actually just posted to shed some light on how the AR routines really work. To be sure I have an accurate rip, I would prolly just re-rip locally and check for matching crcs in the log, which would be fulfilling/confident enough for me. But like I said, that's not the issue, it's more like, erhm, the questions! :P

pjc2
01-31-2010, 09:14 AM
> Looking at the latest database code, it will keep Conf: 1 if no other submissions, so I was right in the first instance (the code did drop conf 1 at one time, but the recent rewrite restored it).
Thanks for clearing that up!

I wanted to make sure I was interpreting my AR results correctly, since the discs where I got Conf: 1 always listed all tracks, and a few discs with Conf:2 omitted some tracks. That seemed to line up with your original explanation.

(I'm glad for the Conf: 1 data, by the way, since it gives me a good deal of confidence for all the tracks that match, and lets me focus my quality efforts on the few that don't.)

Gew
01-31-2010, 04:20 PM
> ...since the discs where I got Conf: 1 always listed all tracks, and a few discs with Conf:2 omitted some tracks.

Please, if you have just a few minutes (and have understood what Spoon means), could you elaborate on the routine as a whole..? 'cuz his one-liners aren't really cutting it for me. Perhaps I'm too dumb. Why are conf:2 discs omitting tracks where conf:1 discs aren't?

I've tried this hypothesis, can you verify it?

Here goes:

A certain DiscID is completely blank in the database. No entries in either raw or published db. A user then rips one or several tracks and submits them. These go instantly (let's overlook the ~month that passes before the update, for simplicity here) --> published db with conf:1, correct?

Then -- naturally -- for each backing submission on same crcs the conf: is increased by one.


However.

Scenario 2)

A certain DiscID is completely blank in the database. No entries in either raw or published db. A user then rips one or several tracks and submits them. These go instantly (let's overlook the ~month that passes before the update, for simplicity here) --> published db with conf:1. THEN, another user rips the same tracks but with different crcs. This [new_crc] does not go straight into published db, but to raw db. Also, the [old_crc] is no longer in published db, but revoked and pulled into raw db, where now both of these crcs are awaiting further verification. Correct?

pjc2
02-01-2010, 09:24 AM
As I understand it, I think you've got the directions backwards. Everything goes into the raw DB. Whether it's then promoted to the published DB is the question.


> A certain DiscID is completely blank in the database. No entries in either raw or published db. A user then rips one or several tracks and submits them. These go instantly (let's overlook the ~month that passes before the update, for simplicity here) --> published db with conf:1, correct?
It always goes into the raw DB. Then, if the database update process deems a rip reliable, it publishes the CRC.

So if you get a disc with conf:1, it's just a single user's submission (all ripped tracks), and it may or may not be "accurate" (since it's just a single data point that hasn't been independently confirmed). Where you agree with that single submission, it's likely accurate (which is why conf:1 is useful and why it gets published). Where you disagree with conf:1, there's no telling which of the two of you is right.


> These go instantly ... --> published db with conf:1. THEN, another user rips the same tracks but with different crcs. This [new_crc] does not go straight into published db, but to raw db. Also, the [old_crc] is no longer in published db, but revoked and pulled into raw db, where now both of these crcs are awaiting further verification. Correct?
Again, everything goes into the raw DB first.

So in this second case, when the raw DB notices a conflicting CRC, each with conf:1, it can't tell which one is accurate, so it considers them both unreliable until one of them is further confirmed. So at the next update to the published DB, the track with mismatched CRCs won't show up at all. This is why a second user's rip will result in lots of conf:2 (where the CRCs match) and a few not-in-database (where the CRCs didn't match).

But everything is still in the raw DB, so when a third user submits rip results, if his CRC matches one of the two previously submitted, that track's confidence will suddenly jump from 0 (nobody agrees) to 2 (2 independent rips agree). (And all the other tracks would bump from 2 to 3, as expected.)
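
The raw/published split described above can be sketched as follows. This is a hypothetical model, not the real AccurateRip update code: `raw_db`, `submit`, and `publish` are made-up names, the CRC values are invented, and the monthly update is reduced to calling `publish()` after each submission.

```python
from collections import Counter, defaultdict

# Raw DB (assumed shape): track number -> every CRC ever submitted for it.
raw_db = defaultdict(list)

def submit(rip):
    """Record one user's rip; `rip` maps track number -> CRC string."""
    for track, crc in rip.items():
        raw_db[track].append(crc)

def publish():
    """Build the published view: track number -> {crc: confidence}."""
    view = {}
    for track, crcs in raw_db.items():
        counts = Counter(crcs)
        if len(counts) > 1:
            # Conflicting CRCs: keep only those that two or more rips agree on.
            counts = {crc: n for crc, n in counts.items() if n >= 2}
        if counts:
            view[track] = dict(counts)
    return view

submit({1: "11111111", 2: "AAAAAAAA"})   # first user
first = publish()
submit({1: "11111111", 2: "BBBBBBBB"})   # second user disagrees on track 2
second = publish()
submit({1: "11111111", 2: "AAAAAAAA"})   # third user confirms the first
third = publish()

for label, view in (("first", first), ("second", second), ("third", third)):
    print(label, view)
```

In this model track 1 climbs 1, 2, 3, while track 2 goes from conf:1 to absent to conf:2, which is the jump described above. Nothing is ever deleted from `raw_db`; only the published view changes.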

Gew
02-01-2010, 02:25 PM
Ty alot pjc2, very good explanation.

The part about "..(which is why conf:1 is useful and why it gets published).." was indeed developing my perspective. I was just too narrow-headed to see, I suppose.

One question only, and I will find closure.

Scenario 3) Blank discID, no submissions whatsoever to begin with. A user rips conf:1, another user verifies and bumps it to conf:2, then a third, fourth, etc. Say that discID/track has got one crc only, and a very well verified one for that matter, say conf:10.

Then, another user submits [another_crc]. Will the routines you described earlier still apply in the same manner, i.e. would this [another_crc] _be made available_ as conf:1 on the grounds that it "may or may not be 'accurate'..", or does the rule that a single conf:1 submission is made available really apply only __when there is only one pressing__?


Boolean this / TLDR / In short:


> So if you get a disc with conf:1, it's just a single user's submission (all ripped tracks), and it may or may not be "accurate" (since it's just a single data point that hasn't been independently confirmed). Where you agree with that single submission, it's likely accurate (which is why conf:1 is useful and why it gets published). Where you disagree with conf:1, there's no telling which of the two of you is right.

Will a single submission be published the way described in quote above only if it is the very first (and only) submission for discID/specificTrackNo..?

If the answer to this is Yes then I will finally have found peace :)

pjc2
02-01-2010, 08:56 PM
> Will a single submission be published the way described in quote above only if it is the very first (and only) submission for discID/specificTrackNo..?
I believe so -- that one odd submission is "inaccurate" because 10 others agree with each other, so there's no question about who's right. The CRCs remain at conf:10 and the odd submission is ignored by the published DB.

This is, of course, ignoring the multiple pressings issue. I'm not exactly sure how AR handles that, but I suspect a different pressing has different CRCs for all of the tracks. (If some CRCs match, that suggests that it's the known pressing and the mismatched CRCs are indeed errors.)

I think what happens here is that (1) AR notices all the CRCs are different and (2) waits for confirmation. I haven't confirmed this, but I suspect the first submission of an alternate pressing is considered unreliable (conf:0) until confirmed by another submission. This is much like the case where the first two submissions disagree -- all the tracks on the alternate pressing are considered conf:0. (However, the tracks on the known good pressing would retain their confidence levels.) Once a second submission comes in to confirm the alternate pressing, the alternate pressing's tracks bump up to conf:2. This is all conjecture at this point, but it follows from the AR design Spoon has described.
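
The conjecture above can be tried out in a small sketch for a single track. Everything here is an assumption for illustration (the `visible` helper and the made-up CRC values), and like the paragraph it follows, it has not been checked against how AR really behaves.

```python
from collections import Counter

def visible(raw_crcs):
    """Hypothetical published view for one track: {crc: confidence}."""
    counts = Counter(raw_crcs)
    if len(counts) == 1:
        return dict(counts)  # a lone CRC is shown, whatever its count
    # Otherwise an odd one-off CRC stays hidden until a second rip confirms it.
    return {crc: n for crc, n in counts.items() if n >= 2}

raw = ["AAAAAAAA"] * 10          # known pressing, already at conf:10
raw.append("CCCCCCCC")           # first rip of a possible alternate pressing
after_first_alt = visible(raw)
raw.append("CCCCCCCC")           # second rip confirms it
after_second_alt = visible(raw)

print(after_first_alt)
print(after_second_alt)
```

Under these assumptions the known pressing keeps conf:10 throughout, the alternate CRC is invisible after one submission, and it appears next to the original at conf:2 after the second.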

I don't yet fully understand how the R14 alternate-pressing-verification system works. I hope to figure that out in the coming weeks.

Gew
02-02-2010, 06:28 PM
Ty again, so very much!

I think I finally got the entire concept, brick by brick. The way you describe a track's sudden bump from "Track not found in database" (eg. what you refer to as 'conf:0') to confidence:2 is brilliant, it leaves little room for misunderstanding of the actual routine.

I believe what got me confuzed in the first place was this:

> > "Confidence:1" it means that 2 unique users/computers has ripped the track in particular
>
> Not always, if there is only one pressing of a CD then a single submission would be made available.

I believe Spoon misunderstood me. I was asking if (for instance) "confidence:5" could mean that 6 (and so on) unique users had submitted the track crc, e.g. that the "count offset" was 1 to begin with. Thus when his answer contained "Not always" I was still wondering in which scenario the count offset was really 1, so that e.g. conf:2 meant that 3 different users had gotten the result, which is really _never_ the case. This sort of made my whole perspective a bit fuzzy there, just as I didn't have enough pieces of the puzzle to put in order! :)

Anyways.

Golden rule is that one discID/trackNo will have one chance only on getting "the benefit of the doubt", and this is when its first crc is submitted. This one checksum goes directly (after update) to published db as conf:1, whereas it gets "pulled back" as soon as any other/conflicting crc's occur on the specific discID/trackNo.

Furthermore..

When I think deep and hard on all this, the -- as you call it -- "multiple pressing issue" is really naturally solved this way. In easy terms, it starts with a conflicting crc, which is -- at its first submit -- pulled back to raw db / not published, but as soon as another user submits it, it goes in a sort of own thread or channel on the same discID/trackNo.

@pjc2,
Please let us know when you've figured out the new -- R14isch -- system, and feel free to publish the routines in the easy language you've done so far. You've been great help. Also Spoon has been of much great help, disregarding that one quote high above that got me dazed & confuzed! ;)


Regards~
Over 'n out~