
Registered: Thursday 18 October 2012 Posts: 13

You have a magnificent skill to ignore everything I'm saying to you or then you just don't understand the problem area here.

I can only ask again, why do you have an API method which you don't want to be called, which you said is heavy, which you said is pointless to call, which always returns the same result, which should be cached here and there.

If you don't understand what I'm trying to say and still believe that the caching you are constantly talking about over and over again, is the only right solution for your problem along with restrictions, then good luck. I honestly believe I'm talking to a wrong person here about these issues and rest my case.

Quote:

seco wrote on Friday 19 October 2012 @ 11:27:
You have a magnificent skill to ignore everything I'm saying to you or then you just don't understand the problem area here.
I can only ask again, why do you have an API method which you don't want to be called, which you said is heavy, which you said is pointless to call, which always returns the same result, which should be cached here and there.
If you don't understand what I'm trying to say and still believe that the caching you are constantly talking about over and over again, is the only right solution for your problem along with restrictions, then good luck. I honestly believe I'm talking to a wrong person here about these issues and rest my case.

He's not saying not to use the FindShowByName method at all. However, calls to this method should be limited, and it should be fairly simple to do so. If you have access to the TVDB ID in your application (I think that should be possible in, for example, MediaPortal), you won't need the FindShowByName method at all, as you can supply that ID to GetAllSubsFor. When you do not have that ID yet, you should be able to fetch the ID once and cache it in a file, or in memory for the current set of subtitles you're trying to fetch if a file is not an option. This should limit the number of calls to FindShowByName to at most once per set of subtitles you're trying to get for a certain show at a certain moment.

The problem with supplying a show by name to GetAllSubsFor is that the show's name doesn't always exactly match the value in our database. What should we do with non-exact matches? Use the best match? Simply deny the request? This would probably cause unexpected replies from GetAllSubsFor, when simply supplying an ID you're certain you're getting the subtitles for the show you requested. If you don't agree with this I'm wondering how you would solve the problem of non-exact matches.
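As a rough illustration of the fetch-once-and-cache idea above, here is a minimal Python sketch. The `ShowIdCache` class and the `fake_lookup` function are hypothetical stand-ins invented for this example (the real lookup would be a FindShowByName request); 5517 is the ID quoted for Lost elsewhere in this thread.

```python
from typing import Callable, Dict


class ShowIdCache:
    """Remembers show IDs so each show name is looked up at most once."""

    def __init__(self, lookup: Callable[[str], int]) -> None:
        self._lookup = lookup            # hypothetical stand-in for FindShowByName
        self._ids: Dict[str, int] = {}   # in-memory cache: normalized name -> ID
        self.remote_calls = 0            # how often the API would have been hit

    def get(self, show_name: str) -> int:
        key = show_name.strip().lower()
        if key not in self._ids:
            self.remote_calls += 1
            self._ids[key] = self._lookup(show_name)
        return self._ids[key]


def fake_lookup(name: str) -> int:
    # Assumption: replaces the real API call for the sake of the example.
    return {"lost": 5517}[name.strip().lower()]


cache = ShowIdCache(fake_lookup)
# One lookup per episode of a 24-episode season, all for the same show.
ids = [cache.get("Lost") for _ in range(24)]
```

Only the first `get` reaches the lookup function; the other 23 are answered from memory. Writing `_ids` to a file would make the cache survive restarts.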

Registered: Thursday 18 October 2012 Posts: 13

Look, I really already know how your current API works and how it should be used so there is no need for repeating that.

I fail to see the problem with GetAllSubsFor using the show name since FindShowByName is already using that and apparently has the logic implemented.

Quote:

seco wrote on Friday 19 October 2012 @ 12:04:
Look, I really already know how your current API works and how it should be used so there is no need for repeating that.
I fail to see the problem with GetAllSubsFor using the show name since FindShowByName is already using that and apparently has the logic implemented.

I know that, but you're still not answering my question of what to do with non-exact matches of a showname in GetAllSubsFor.

Registered: Thursday 18 October 2012 Posts: 13

What you have:

Code (php):

function FindShowByName(showName) {
    return WhatEverLogicIsHere(showName);
}

function GetAllSubsFor(showId) {
    return WhatEverLogicIsHere(showId);
}

What I currently do:

Code (php):

var shows = FindShowByName(showName);

for (show in shows) {
    subs = GetAllSubsFor(show.Id);
}

What you could have:

Code (php):

function GetAllSubsFor(showName) {
    // Optimize/cache/whatever this implementation as you like
    var shows = FindShowByName(showName);

    // Optimize/cache/whatever this implementation as you like
    for (show in shows) {
        subs = GetAllSubsFor(show.Id);
    }

    return subs;
}

What I would do:

Code (php):

var subs = GetAllSubsFor(showName);

Registered: Thursday 18 October 2012 Posts: 13

Quote:

I know that, but you're still not answering my question of what to do with non-exact matches of a showname in GetAllSubsFor.

I don't have the answer, it's in your FindShowByName implementation.

seco edited this post on 19-10-2012 at 10:17, edited 1 time in total

This is actually a perfect example of what would be really weird behaviour.

FindShowByName returns multiple shows based on your input, which can be a partial match. It would be really weird to start fetching the subtitles for all of those returned shows, as there's probably only one you're looking for. Using FindShowByName, your application or end-user can decide which show that is and then start fetching the subtitles. If we allowed supplying the show name to GetAllSubsFor, we would not return the subtitles for all partial matches, that'd be ridiculous, so we'd have to decide which show you're looking for (best guess) instead of you deciding which one.

Registered: Thursday 18 October 2012 Posts: 13

Well, like I said I don't know how your FindShowByName is implemented. For what I suggested the best option would be "one best guess" instead of "all partial matches".

All other sites have managed to do the title matching, but your approach seems to be handing all the difficult stuff to end-users & clients instead of solving the problems where they should be solved.

And yes, other sites sometimes return irrelevant results but still all the results are in order by relevancy (best match first)
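To make "results in order by relevancy (best match first)" concrete, here is a small Python sketch. It is only an assumption about how such ordering could be done, not a description of FindShowByName; the `rank_by_relevancy` helper is hypothetical and uses the standard library's `difflib.SequenceMatcher` as the similarity measure. The candidate titles are the ones reported for "heroes" later in this thread.

```python
from difflib import SequenceMatcher
from typing import List


def rank_by_relevancy(query: str, titles: List[str]) -> List[str]:
    """Return titles ordered by similarity to the query, best match first."""

    def score(title: str) -> float:
        # Case-insensitive similarity ratio between 0.0 and 1.0.
        return SequenceMatcher(None, query.lower(), title.lower()).ratio()

    return sorted(titles, key=score, reverse=True)


candidates = ["Haven", "Outcasts", "Kings", "Heroes"]
ranked = rank_by_relevancy("heroes", candidates)
best_guess = ranked[0]  # the "one best guess" option
```

A client can either show `ranked` to the user or take `best_guess` directly; which of the two is right is exactly what this thread disagrees about.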

Quote:

seco wrote on Friday 19 October 2012 @ 12:29:
Well, like I said I don't know how your FindShowByName is implemented. For what I suggested the best option would be "one best guess" instead of "all partial matches".
All other sites have managed to do the title matching, but your approach seems to be handing all the difficult stuff to end-users & clients instead of solving the problems where they should be solved.
And yes, other sites sometimes return irrelevant results but still all the results are in order by relevancy (best match first)

All partial matches is not an option. As there are many matches, it'd be useless to return all of those subtitles. We're perfectly able to give you a best match; however, that means you're stuck with that one and unable to request any of the other partial matches unless we create a new method.

Registered: Thursday 18 October 2012 Posts: 13

I didn't suggest all partial matches; I was saying that I believe that is how it currently works. How many "best" matches you use is up to you.

Oh my god. Why would you even want to have this discussion.

There is no reason AT ALL to have a GetAllSubsFor(showName). It's useless to return all the subs for all the shows that get returned by this method. And one 'best-fit' result can be good for one user, but not for the other.

Just use the IDs of the show. You are certain that it will return the exact show and you don't burn their servers with unnecessary requests.

Registered: Thursday 18 October 2012 Posts: 13

Quote:

blaitem wrote on Friday 19 October 2012 @ 13:10:
Oh my god. Why would you even want to have this discussion.
There is no reason AT ALL to have a GetAllSubsFor(showName). It's useless to return all the subs for all the shows that get returned by this method. And one 'best-fit' result can be good for one user, but not for the other.
Just use the IDs of the show. You are certain that it will return the exact show and you don't burn their servers with unnecessary requests.

No one has suggested a method that would return all subs for all shows so hold your horses.

What I originally said (first page of this thread):

"From what I'm looking here is that there is and never has been a way to search subtitles by making a single query like Search(showname, seasonnumber, episodenumber)."

When I say GetAllSubsFor(showName) I mean GetAllSubsFor(showName, seasonNumber, episodeNumber) and the logic for finding the show is described above ("best" match).

Registered: Saturday 28 November 2009 Posts: 4

Um, about the FindShowByName queries, why don't you put an HTTP cache daemon in front of the API? That should really lower the load on your servers...

But that's the point. The 'best match' method is useless. A person may look for 'Friends' but he actually means 'friends with benefits'. He will get the subs for the show 'Friends', but not for his 'requested' show. (maybe a silly example, but there are cases where the best-fit will have strange behavior)

Therefore, you will first have to search for all the shows with 'friends' in it and propose these results to your user. And with this result you can do a GetAllSubsFor(show.Id). Then of course, just save the ID of the requested show in your datastore and you're good to go. I really don't see the problem with that approach...

And also: The logic for the best-fit method will probably be far more exhausting for the server/backend than an indexed search on an ID.

Registered: Thursday 18 October 2012 Posts: 13

Quote:

blaitem wrote on Friday 19 October 2012 @ 15:16:
But that's the point. The 'best match' method is useless. A person may look for 'Friends' but he actually means 'friends with benefits'. He will get the subs for the show 'Friends', but not for his 'requested' show. (maybe a silly example, but there are cases where the best-fit will have strange behavior)
Therefore, you will first have to search for all the shows with 'friends' in it and propose these results to your user. And with this result you can do a GetAllSubsFor(show.Id). Then of course, just save the ID of the requested show in your datastore and you're good to go. I really don't see the problem with that approach...
And also: The logic for the best-fit method will probably be far more exhausting for the server/backend than an indexed search on an ID.

You are telling me that you know better my end-user requirements & API requirements than I do? Please don't do that. Also your comment about server/backend exhausting clearly indicates that you don't have the technical competence to address this issue...

Registered: Sunday 24 September 2006 Posts: 36

Totally agree with Sypher, it's not that hard to store the show ID in a simple database or a text file. It's a free service, why can't you do some of the work yourself?

Registered: Saturday 21 November 2009 Posts: 26

Well, it seems a little strange these days that it's not allowed to get a show ID based on a name and then request the wanted subtitle within 1 second. Yes, one could cache, but there is still no way to get a show ID for a new show and directly request a subtitle.

Why else would one have a call "get showid by name" if you could not use the ID immediately afterwards to collect the sub link?

A cache option could be nice, but I imagine that for a library developer it's somewhat difficult to implement, because the developer has no way of knowing who uses the library and in what environment.

I'd say that a limit on calls alone could be sufficient. If one IP can do max 250 (or 150?) calls, then no matter how they are used, it's up to the user how to spend them. If a program spends half of its calls on useless get-ID calls, well, that means fewer subs to get in a day. Wouldn't that get the load on the API down enough?

It's just my 2 cents.

(If anyone would like the above translated, just let me know.)
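The per-IP daily limit proposed above could look roughly like the following Python sketch. The `DailyQuota` class is hypothetical and only illustrates the counting; the limit of 250 is the figure floated in the post, and the IP address is a documentation example address.

```python
from collections import defaultdict
from datetime import date
from typing import DefaultDict, Tuple


class DailyQuota:
    """Allows each IP a fixed number of calls per calendar day."""

    def __init__(self, limit: int = 250) -> None:
        self.limit = limit
        # (ip, day) -> number of calls already granted on that day
        self._used: DefaultDict[Tuple[str, date], int] = defaultdict(int)

    def allow(self, ip: str, today: date) -> bool:
        key = (ip, today)
        if self._used[key] >= self.limit:
            return False          # budget for today is spent
        self._used[key] += 1
        return True


quota = DailyQuota(limit=3)       # tiny limit to keep the demo short
day = date(2012, 10, 19)
results = [quota.allow("203.0.113.7", day) for _ in range(4)]
next_day = quota.allow("203.0.113.7", date(2012, 10, 20))
```

How the budget is spent is left to the client: a lookup and a subtitle request each cost one call here.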

Quote:

Galaphile wrote on Friday 19 October 2012 @ 21:31:
Well, it seems a little strange these days that it's not allowed to get a show ID based on a name and then request the wanted subtitle within 1 second. Yes, one could cache, but there is still no way to get a show ID for a new show and directly request a subtitle.

The limit could just as well be 3 requests per second, but it never hurts for a client to simply behave and not mindlessly hammer the server just because it can respond quickly.

Quote:

Why else would one have a call "get showid by name" when you could not use the id immediately afterwards to collect the sub-link

That is possible, because the limit was 2 requests per second. Only if you make more than that would you run into the "slow down" response.
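A client can stay under a limit of 2 requests per second by spacing its own calls. The Python sketch below is one possible way to do that; the `Throttle` class is hypothetical, and the clock and sleep functions are injectable only so the demo can run instantly with a frozen clock.

```python
import time
from typing import Callable, List


class Throttle:
    """Keeps consecutive calls at least 1/per_second seconds apart."""

    def __init__(self, per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._interval = 1.0 / per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")   # the first call never waits

    def wait(self) -> float:
        """Block until the next call is allowed; return the delay applied."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # The call starts at now + delay; the next one may start one interval later.
        self._next_allowed = now + delay + self._interval
        return delay


# Demo with a frozen clock: three calls issued at the same instant.
slept: List[float] = []
throttle = Throttle(per_second=2, clock=lambda: 100.0, sleep=slept.append)
delays = [throttle.wait() for _ in range(3)]
```

In a real client, `throttle.wait()` would be called right before every API request, using the default clock and sleep.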

Quote:

A cache option could be nice, but I imagine that for a library developer it's somewhat difficult to implement, because the developer has no way of knowing who uses the library and in what environment.

If it is a library, surely it can offer caching as well? And otherwise the end user can do that themselves, at least if they want to be a considerate API consumer.

Quote:

I'd say that a limit on calls alone could be sufficient. If one IP can do max 250 (or 150?) calls, then no matter how they are used, it's up to the user how to spend them. If a program spends half of its calls on useless get-ID calls, well, that means fewer subs to get in a day. Wouldn't that get the load on the API down enough?
It's just my 2 cents.

If we don't apply a per-second limit, you could burn through your requests in about 2-5 minutes if the API client can't restrain itself in the number of simultaneous/consecutive requests.

I'm fine with setting the limit at 250 per day and leaving it to developers whether they are considerate towards the API and their end users and implement caching and some form of rate limiting themselves.

But I have another idea that may be even better, since it returns no errors when limits are reached and therefore requires no code changes.

The idea is to serve the first X requests per hour or day as normal; if you go over that (i.e. you poll too often, the application is not considerate), the API responses are slowed down.

I think this is the way to go. It results in a win-win situation.

We prevent a flood of simultaneous connections (at least once that point has been reached).
Considerate API clients will be affected less, or not at all, compared with inconsiderate ones.
If you are efficient and only fetch what is really needed (GetAllSubsFor, for example), you can make more API calls than if you need 2 calls for every subtitle (FindShowByName + GetAllSubsFor), which effectively leaves you with half as many "fast" calls.

Again: this is purely an idea. Since in my view it results in a win-win situation, I think it will probably be the best method.
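The soft-throttling idea above (serve the first X requests normally, then slow responses down instead of returning errors) can be sketched in a few lines of Python. Everything numeric here is an assumption made for illustration: the post does not say what X is, how fast the delay grows, or whether it is capped, and `response_delay` is a hypothetical helper.

```python
def response_delay(requests_this_hour: int,
                   free_requests: int = 100,
                   step_seconds: float = 0.5,
                   max_delay: float = 10.0) -> float:
    """Seconds to wait before answering, given the caller's request count.

    Requests within the free budget are answered immediately; every request
    beyond it adds `step_seconds` of delay, capped at `max_delay`.
    """
    excess = requests_this_hour - free_requests
    if excess <= 0:
        return 0.0
    return min(max_delay, excess * step_seconds)


within_budget = response_delay(50)     # considerate client, no delay
just_over = response_delay(104)        # 4 requests over the budget
hammering = response_delay(1000)       # far over the budget, delay is capped
```

Because the client only ever sees slower answers, existing applications keep working unchanged, which is the property the post is after.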

Registered: Tuesday 23 December 2008 Posts: 9

Is Xbmc Subtitles one of the problem cases too? As far as I know it does do caching, but it searches by name first rather than by tvdbid.

Edit: I just saw a reply on Tweakers saying this is going to change.

amebor edited this post on 19-10-2012 at 20:16, edited 1 time in total

Quote:

Sypher wrote on Friday 19 October 2012 @ 11:02:

Take this for instance:
Client: What's the ID for Lost?
API: That's 5517 (BDID) or 73739 (thetvdb)
Client: Can I get the subs for show 5517/73739 season 3 episode 6?
API: Here is the link
Nice, I have received the subs for this episode, but I also want them for episode 7, so let's get it...
Client: What's the ID for Lost?
API: That's 5517 (BDID) or 73739 (thetvdb)
Client: Can I get the subs for show 5517/73739 season 3 episode 7?
API: Here is the link
The red part (the second ID lookup) is repeated and thus pointless; this may happen up to 24 times if a whole season is being scraped, or hundreds of times if the whole Lost series or an entire TV series library is being scraped.
Why is it so hard for an API client to simply behave?
Why would it have to make 10+ connections in one second simply because the server is - at that moment - capable of handling it?
Why would it need to query the very same data time and time again even if it will never change?
Even if you cannot store it permanently, it could be stored in memory for subsequent calls for the same show. That would still save some calls.
IDs are static and will not change; there should be no reason for any API client to query this again after it has been done once. It's a waste of resources on both the client and the server side.

This explains pretty much everything, and I think that applications should do that; however, I do not have any technical experience.

Quote:

seco wrote on Friday 19 October 2012 @ 17:20:

Quote:

blaitem wrote on Friday 19 October 2012 @ 15:16:
But that's the point. The 'best match' method is useless. A person may look for 'Friends' but he actually means 'friends with benefits'. He will get the subs for the show 'Friends', but not for his 'requested' show. (maybe a silly example, but there are cases where the best-fit will have strange behavior)
Therefore, you will first have to search for all the shows with 'friends' in it and propose these results to your user. And with this result you can do a GetAllSubsFor(show.Id). Then of course, just save the ID of the requested show in your datastore and you're good to go. I really don't see the problem with that approach...
And also: The logic for the best-fit method will probably be far more exhausting for the server/backend than an indexed search on an ID.

You are telling me that you know better my end-user requirements & API requirements than I do? Please don't do that. Also your comment about server/backend exhausting clearly indicates that you don't have the technical competence to address this issue...

Lol

Registered: Saturday 28 November 2009 Posts: 4

Quote:

The idea is to serve the first X requests per hour or day as normal; if you go over that (i.e. you poll too often, the application is not considerate), the API responses are slowed down.

I can live with this just fine! I hope it is sufficient for you.

I would still seriously look at a cache server as well. It can easily remember that a particular FindShowByName has a particular answer that will not expire; with a few wildcards you can ignore the API key and let the caching server answer without it costing the back-end even a single CPU cycle...
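The cache-server suggestion above hinges on leaving the API key out of the cache key, so that every client shares one cached answer for the same lookup. The Python sketch below shows that idea only; the URL layout `/<apikey>/<method>/<args>` is an assumption made for the example, and `cache_key`, `ResponseCache` and `fake_backend` are hypothetical names. It would apply only to responses that do not depend on who is asking.

```python
from typing import Callable, Dict


def cache_key(path: str) -> str:
    """Drop the leading API-key segment so all clients share one entry.

    Assumes a hypothetical layout of /<apikey>/<method>/<args...>.
    """
    _empty, _api_key, rest = path.split("/", 2)
    return "/" + rest.lower()


class ResponseCache:
    """Answers repeated lookups from memory instead of the back-end."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self._backend = backend
        self._store: Dict[str, str] = {}
        self.backend_hits = 0

    def get(self, path: str) -> str:
        key = cache_key(path)
        if key not in self._store:
            self.backend_hits += 1
            self._store[key] = self._backend(path)
        return self._store[key]


def fake_backend(path: str) -> str:
    return "5517"   # stand-in for the real back-end response


cache = ResponseCache(fake_backend)
first = cache.get("/KEY-A/FindShowByName/Lost")
second = cache.get("/KEY-B/FindShowByName/lost")   # different key, same lookup
```

A real deployment would do this in the caching proxy's configuration rather than in application code.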

We already run a caching server for the whole web stack, but if it also has to cache API responses, that makes the configuration a good deal more complex (and more memory-intensive).

The key cannot always be ignored, because it is sometimes needed for follow-up actions (think of the actual download).

That would indeed prevent unnecessary load on the backend, but why should we be the ones to adapt when developers don't want to stick to the rules and don't teach their app any manners (no hammering and no repeated identical requests)?

Registered: Thursday 18 October 2012 Posts: 13

I have now implemented support for searching by TVDB ID, which leads to only one request by using GetAllSubsFor with the istvdbid flag set to true.

As a fallback, if the TVDB ID is unknown or not passed to my implementation, the behavior is what I've described earlier, but the show IDs are cached in memory for future use. So the new implementation is backwards compatible because of the fallback and performs better in both cases.

What I noticed is that FindShowByName seems (to me) to return quite a lot of irrelevant results, for example for "heroes": "Heroes", "Haven", "Outcasts", "Kings", so maybe this is something you want to take a look at on your side.
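The ID-first approach with a cached name fallback described in this post might be structured like the Python sketch below. It is not the poster's actual code: `find_show_by_name` and `get_all_subs_for` are hypothetical stand-ins that only record which requests would be made, and the IDs are the ones quoted for Lost earlier in the thread.

```python
from typing import Dict, List, Optional

calls: List[str] = []   # records every simulated API request


def find_show_by_name(name: str) -> List[int]:
    # Hypothetical stand-in for FindShowByName; returns internal show IDs.
    calls.append(f"FindShowByName({name})")
    return [5517]


def get_all_subs_for(show_id: int, is_tvdb_id: bool) -> List[str]:
    # Hypothetical stand-in for GetAllSubsFor with the istvdbid flag.
    calls.append(f"GetAllSubsFor({show_id}, istvdbid={is_tvdb_id})")
    return [f"subs-for-{show_id}"]


_id_cache: Dict[str, List[int]] = {}   # show name -> IDs, kept in memory


def fetch_subs(show_name: str, tvdb_id: Optional[int] = None) -> List[str]:
    if tvdb_id is not None:
        # Preferred path: a single request, no name lookup at all.
        return get_all_subs_for(tvdb_id, True)
    key = show_name.strip().lower()
    if key not in _id_cache:
        # Fallback: look the name up once and remember the result.
        _id_cache[key] = find_show_by_name(show_name)
    subs: List[str] = []
    for show_id in _id_cache[key]:
        subs.extend(get_all_subs_for(show_id, False))
    return subs


fetch_subs("Lost", tvdb_id=73739)   # TVDB ID known: one request
fetch_subs("Lost")                  # fallback: lookup + subtitle request
fetch_subs("Lost")                  # fallback again: lookup served from memory
```

Across the three calls the name lookup happens once, and the TVDB path never triggers it.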

Thanks for improving your app/script.

FindShowByName uses the same search method as the website, which also includes the summary IIRC. It would indeed be better not to do that for the API, we'll look into improving that.


[Feedback] API Throttling/rate limit