How many queries would you have to make to fully explore a deterministic recommendation space?
Observe: there are 12 classes of 'interests', four 0-10 sliders, and each request responds with 6 recommended clubs.
You can't click an interest twice, and it appears you need to select at least one for the quiz to continue.
So we can have up to 11 null entries in a request, like this:
'X'+'-'*11
'X-----------'
So, there are 12 possible starting positions for 'interests', and for each of those there are then 11 remaining possible positions for the next 'interest', and so on.
Counting every ordering of all 12 interests, that's $12 \times 11 \times 10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1$, or
from math import factorial #this is python, not code golf
factorial(12)
479001600
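The factorial counts every ordering of all 12 interests. Whether the quiz actually cares about click order, or only about which interests end up selected, is not established here, so the following is only a sketch of how the count changes under two other assumptions. The names `unordered_selections` and `ordered_selections` are made up for illustration.

```python
from math import factorial, perm  # math.perm needs Python 3.8+

N_INTERESTS = 12

# Assumption A: only the set of selected interests matters, with at least one
# selected. Each interest is either in or out, minus the empty selection.
unordered_selections = 2**N_INTERESTS - 1

# Assumption B: click order matters and any number of interests from 1 to 12
# may be selected. perm(n, k) counts ordered picks of k items out of n.
ordered_selections = sum(perm(N_INTERESTS, k) for k in range(1, N_INTERESTS + 1))

print(unordered_selections)  # 4095
print(ordered_selections)    # 1302061344
print(ordered_selections > factorial(N_INTERESTS))  # True: it includes all 12! full orderings
```

Under assumption A the interest space alone would take roughly 68 minutes at one query per second; under assumption B it is a little under three times the factorial figure.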
Assuming each query takes a second, that's
from datetime import timedelta
def td_fmt(td):
    # Break a timedelta into rough calendar units; no leap years, sue me
    return {
        'years': td.days // 365,
        'days': td.days % 365,
        'hours': td.seconds // 3600,
        'minutes': (td.seconds // 60) % 60,
        'seconds': td.seconds % 60
    }
interest_total = factorial(12)
td_fmt(interest_total * timedelta(seconds=1))
{'years': 15, 'days': 69, 'hours': 0, 'minutes': 0, 'seconds': 0}
But we're not done yet; we still have a 4-dimensional space (i.e. the four sliders) to enumerate!
This one's relatively easy, as we always submit a value for every slider and there's none of this factorial business; we have a 'hypercube' of side 10, easy.
slider_total = 10**4
print(slider_total)
td_fmt(slider_total * timedelta(seconds=1))
10000
{'years': 0, 'days': 0, 'hours': 2, 'minutes': 46, 'seconds': 40}
And bringing that together with our 'interest' calculation: it's not a case of adding these query spaces together. Every interest ordering can be paired with every slider setting, so the counts multiply. To fully explore the combined space defined by the 12 interest permutations and the 4 sliders, we need
total = factorial(12) * 10**4
total, td_fmt(total * timedelta(seconds=1))
(4790016000000, {'years': 151890, 'days': 150, 'hours': 0, 'minutes': 0, 'seconds': 0})
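The slider count above takes 10 positions per slider. If each slider actually accepts every whole number from 0 to 10 inclusive, that is 11 positions; this is an assumption about the form, not something verified here. A minimal sketch of enumerating the grid lazily under that assumption, using the four slider field names from the quiz's form data (`Budget`, `Time`, `Travel`, `Joined`); `slider_settings` is a hypothetical helper:

```python
from itertools import product

SLIDER_FIELDS = ('Budget', 'Time', 'Travel', 'Joined')
SLIDER_VALUES = range(0, 11)  # assumption: integers 0..10 inclusive, 11 positions


def slider_settings():
    """Yield one dict per slider combination, with values as strings."""
    for combo in product(SLIDER_VALUES, repeat=len(SLIDER_FIELDS)):
        yield {field: str(value) for field, value in zip(SLIDER_FIELDS, combo)}


n_settings = sum(1 for _ in slider_settings())
print(n_settings)  # 14641, i.e. 11**4
```

With 11 positions the slider space is about 46% larger than `10**4`, which makes the combined total worse but doesn't change the conclusion.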
At this point it should be clear that exhaustive searching is difficult to say the least.
But wait! I hear you say 'multiprocessing' and 'cloud scale' and 'distributed computing'!
Ok, great! Well, now what?
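For a sense of scale, here is a back-of-envelope sketch of what parallelism buys. It assumes the work splits perfectly evenly, that each query still takes one second, and that the server tolerates the load, all of which are optimistic. `workers_needed` is a hypothetical helper, not part of any library.

```python
from math import factorial

total_queries = factorial(12) * 10**4
SECONDS_PER_QUERY = 1            # same assumption as the estimates above
SECONDS_PER_DAY = 24 * 60 * 60


def workers_needed(queries, deadline_days, seconds_per_query=SECONDS_PER_QUERY):
    """Parallel workers needed to finish all queries within deadline_days."""
    per_worker = (deadline_days * SECONDS_PER_DAY) // seconds_per_query
    # Integer ceiling division, so a partial share still costs a whole worker
    return (queries + per_worker - 1) // per_worker


for days in (365, 30, 1):
    print(days, workers_needed(total_queries, days))
```

Finishing within a year would take on the order of 150,000 workers all hitting one endpoint, and finishing within a day would take tens of millions.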
import requests
from bs4 import BeautifulSoup
base = 'https://hookup-qubsu.org/home/GetResults'
# One fixed query: a single interest, with every slider set to 5
static_query = {'Categories': ['Democracy'],
                'Budget': '5',
                'Time': '5',
                'Travel': '5',
                'Joined': '5'
                }
def get_clubs(q):
    # POST the form data, then pull the recommended club names out of the HTML
    response = requests.post(base, data=q)
    content = response.content
    duration = response.elapsed.total_seconds()
    s = BeautifulSoup(content, 'html.parser')
    clubs = [h.get_text() for h in s.select('div.answers > h2')]
    return clubs, duration
# Send the identical query five times and print each set of clubs
for i in range(5):
    print(set(get_clubs(static_query)[0]))
{'Mind Matters Society', 'Martial Arts & Combat Sports Clubs', 'Musical Theatre Society', 'Volleyball Club', 'Visual Arts Society', 'Dance Club'}
{'Belfast Marrow Society', 'Volleyball Club', 'Volunteering', 'Nightline Society', 'Homework Clubs', 'Music Society'}
{'Inspiring Leaders', 'Belfast Marrow Society', 'Musical Theatre Society', 'Amnesty', 'Volunteering', "Writers' Society"}
{'Traditional Crafts Society', 'Trócaire Society', 'GAA Clubs ', 'Unihoc-Floorball Club', 'Cheerleading Club', 'Badminton Club'}
{'Photography Society', 'Medical Societies', 'University Air Squadron Society', 'Robotics Society', 'Players Society', 'Rugby Club'}
Exhaustively assessing this 'model' would be ludicrous: the same query returned a different set of clubs on each of five attempts, so it already appears to be 'probabilistic', or at least 'non-deterministic'.
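One way to put a number on that impression is to compare the five responses pairwise. The sketch below uses Jaccard similarity (size of the intersection over size of the union), which is a measure chosen here for illustration rather than anything the quiz defines. The club names are copied from the output above.

```python
from itertools import combinations

# The five result sets printed above for the identical query
runs = [
    {'Mind Matters Society', 'Martial Arts & Combat Sports Clubs',
     'Musical Theatre Society', 'Volleyball Club', 'Visual Arts Society', 'Dance Club'},
    {'Belfast Marrow Society', 'Volleyball Club', 'Volunteering',
     'Nightline Society', 'Homework Clubs', 'Music Society'},
    {'Inspiring Leaders', 'Belfast Marrow Society', 'Musical Theatre Society',
     'Amnesty', 'Volunteering', "Writers' Society"},
    {'Traditional Crafts Society', 'Trócaire Society', 'GAA Clubs ',
     'Unihoc-Floorball Club', 'Cheerleading Club', 'Badminton Club'},
    {'Photography Society', 'Medical Societies', 'University Air Squadron Society',
     'Robotics Society', 'Players Society', 'Rugby Club'},
]


def jaccard(a, b):
    """Jaccard similarity of two sets: 1.0 if identical, 0.0 if disjoint."""
    return len(a & b) / len(a | b)


# Similarity for every pair of runs, keyed by the pair of run indices
similarities = {
    (i, j): jaccard(runs[i], runs[j])
    for i, j in combinations(range(len(runs)), 2)
}
distinct_clubs = set().union(*runs)

print(max(similarities.values()))  # 0.2: the closest pair shares 2 of 10 clubs
print(len(distinct_clubs))         # 26 distinct clubs across 30 recommendations
```

A deterministic endpoint would score 1.0 for every pair; here no pair of responses shares more than two clubs.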
We have to be more clever than that...