Product Feature Optimization With Constraints
I have trained a LightGBM model on a learning-to-rank dataset. The model predicts the relevance score of a sample, so the higher the prediction, the better. Now that the model has learned, how do I find the feature values that maximize the predicted score, subject to constraints between the features?
Solution 1:
It's been a good minute since I last wrote some serious code, so I apologize if it's not entirely clear what everything does; please feel free to ask for more explanation.
The imports:
from sklearn.ensemble import GradientBoostingRegressor
from scipy.optimize import minimize
from copy import copy
import numpy as np
First I define a new class that allows me to easily redefine values. This class has 5 inputs:
- value: this is the 'base' value. In your equation y = Ax + b, it's the b part
- minimum: this is the minimum value this type will evaluate as
- maximum: this is the maximum value this type will evaluate as
- multipliers: the first tricky one. It's a list of pairs, where the first item of each pair is another InputType object and the second is its multiplier. In your example y = Ax + b you would have [[x, A]]; if the equation was y = Ax + Bz + Cd it would be [[x, A], [z, B], [d, C]]
- relations: the most tricky one. It's also a list, and each entry has four items: the first is another InputType object; the second is min if the boundary is an upper one and max if it is a lower one; the third is the value of the boundary; and the fourth is the output value returned when the boundary is crossed (there is a worked example after the class below)

Watch out: if you define your input values too strangely, I'm sure there's weird behaviour.
class InputType:
    def __init__(self, value=0, minimum=-1e99, maximum=1e99, multipliers=None, relations=None):
        """
        :param float value: base value
        :param float minimum: the evaluated value can never be lower than this
        :param float maximum: the evaluated value can never be higher than this
        :param multipliers: [[InputType, multiplier], [InputType, multiplier]]
        :param relations: [[InputType, min, threshold, output_value], [InputType, max, threshold, output_value]]
        """
        self.val = value
        self.min = minimum
        self.max = maximum
        # None instead of [] as a default avoids the mutable-default-argument pitfall
        self.multipliers = multipliers if multipliers is not None else []
        self.relations = relations if relations is not None else []

    def reset_val(self, value):
        self.val = value

    def evaluate(self):
        """
        - relations to other variables are checked first; if none of them fires, the rest is evaluated
        - the result is at most self.max and at least self.min
        - otherwise it is self.val + i_x * w_x, where i_x is input i and w_x is the multiplier (weight) of i
        """
        for term, min_max, value, output_value in self.relations:
            # check for each term whether it falls outside of the expected bounds
            if min_max(term.evaluate(), value) != term.evaluate():
                return self.return_value(output_value)
        output_value = self.val + sum(i[0].evaluate() * i[1] for i in self.multipliers)
        return self.return_value(output_value)

    def return_value(self, output_value):
        # clamp the result to [self.min, self.max]
        return min(self.max, max(self.min, output_value))
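To make the semantics concrete, here is a minimal sketch (with made-up names and numbers, not part of the problem below) of how evaluate() resolves multipliers, relations, and clamping:

p = InputType(value=2)                           # no multipliers or relations: evaluates to 2
q = InputType(value=1, multipliers=[[p, 3]])     # 1 + 3 * 2 = 7
r = InputType(relations=[[q, max, 10, 0.5]])     # lower boundary: q must be at least 10
print(p.evaluate(), q.evaluate(), r.evaluate())  # 2 7 0.5 -> q = 7 breaks the boundary, so r returns 0.5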
Using this, you can fix the input values sent from the optimizer, as shown in _call_model:
class Example:
    def __init__(self, lst_args):
        self.lst_args = lst_args
        self.X = np.random.random((10000, len(lst_args)))
        self.y = self.get_y()
        self.clf = GradientBoostingRegressor()
        self.fit()

    def get_y(self):
        # sum of squares; the minimum is at x = [0, 0, 0, 0, 0, ...]
        return np.array([[self._func(i)] for i in self.X])

    def _func(self, i):
        return sum(i * i)

    def fit(self):
        # ravel() turns the (n, 1) target into the 1-d shape sklearn expects
        self.clf.fit(self.X, self.y.ravel())

    def optimize(self):
        x0 = [0.5 for i in self.lst_args]
        initial_simplex = self._get_simplex(x0, 0.1)
        result = minimize(fun=self._call_model,
                          x0=np.array(x0),
                          method='Nelder-Mead',
                          options={'xatol': 0.1,
                                   'initial_simplex': np.array(initial_simplex)})
        return result

    def _get_simplex(self, x0, step):
        # Nelder-Mead needs len(x0) + 1 starting points
        simplex = []
        for i in range(len(x0)):
            point = copy(x0)
            point[i] -= step
            simplex.append(point)
        point2 = copy(x0)
        point2[-1] += step
        simplex.append(point2)
        return simplex

    def _call_model(self, x):
        # push the optimizer's raw values into the InputType objects, then
        # evaluate them so that multipliers, relations, and bounds are applied
        for i, value in enumerate(x):
            self.lst_args[i].reset_val(value)
        input_x = np.array([i.evaluate() for i in self.lst_args])
        prediction = self.clf.predict(input_x.reshape(1, -1))
        return prediction[0]
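One note for your actual use case: your LightGBM model predicts a relevance score you want to maximize, while minimize() minimizes, so the objective needs a sign flip. A minimal sketch of _call_model adapted to that setting, assuming a trained lightgbm.Booster stored on the class as self.booster (a hypothetical attribute, not part of the example above):

def _call_model(self, x):
    for i, value in enumerate(x):
        self.lst_args[i].reset_val(value)
    input_x = np.array([i.evaluate() for i in self.lst_args])
    # self.booster is assumed to be a trained lightgbm.Booster
    score = self.booster.predict(input_x.reshape(1, -1))[0]
    return -score  # minimizing the negated score maximizes the relevance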
I can define your problem as shown below (be sure to define the inputs in the same order as the final list, otherwise not all the values will get updated correctly in the optimizer!):
A = 5
b = 2
thresh_a = 5
thresh_b = 10
thresh_c = 10.1
thresh_m = 4
thresh_n = 6

u = InputType()
v = InputType()
w = InputType()
x = InputType(minimum=thresh_m, maximum=thresh_n)
y = InputType(value=b, multipliers=[[x, A]])
z = InputType(relations=[[y, max, thresh_a, 4], [y, min, thresh_b, 3.5], [y, max, thresh_c, 3.7]])

example = Example([u, v, w, x, y, z])
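To see the constraint wiring in action before optimizing, you can evaluate the chain by hand (the values below just follow from the definitions above):

x.reset_val(5)       # inside [thresh_m, thresh_n] = [4, 6], so x evaluates to 5
print(y.evaluate())  # b + A * x = 2 + 5 * 5 = 27
print(z.evaluate())  # y = 27 > thresh_b = 10, so the min relation fires and z = 3.5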
Getting the results:
result = example.optimize()
for i, value in enumerate(result.x):
    example.lst_args[i].reset_val(value)
print(f"final values are at: {[i.evaluate() for i in example.lst_args]}: {result.fun}")